How to get to your
corporate data easily 


If you don't know the first thing 
about SQL, you already know 
the first thing about Quest™ 

Can you use Microsoft® Windows™? Then 
you can use Quest™. Quest’s menus,
windows and icons make it fast and easy to 
unlock a treasure chest of corporate 
information without any knowledge of SQL. 

Use Quest to access DB2, Oracle®, OS/2®
Extended Edition Database Manager or your 
favorite SQL database server. 

Then, use your newly acquired data 
any way you want. Prepare reports the easy
way, with Quest. Crunch the data with Excel 
or Lotus 1-2-3.® Make it look pretty with 
PageMaker.® 

Best of all, you don’t have to do any 
programming. You don’t even need to know 
how to pronounce “SQL.” 

Your Windows and PC LAN 
database connection. 

Quest is the graphical data access tool for 
everyone. There’s nothing better than Quest 
for getting data from mainframes, minicomputers
or PC LAN servers. But you can even
use Quest without a server, on your own PC. 

Suppose you want to use records in a 
dBase file for Microsoft Word mail merge. 
Solution: give your assistant Quest. 

Your letters will be on your desk in no 
time, thanks to Quest’s point-and-click 
simplicity. 

Use Quest with confidence. 

Quest is the latest member of the Gupta 
SQL System, which includes SQLWindows®,
SQLNetwork™ and our SQLBase® Server. 

Ask your MIS department to install 
SQLNetwork, giving you safe access to 
corporate data stored in your company’s 
mainframe or minicomputer database. 

Use Quest with confidence. Gupta 
products have won rave reviews for years 
from experts and independent reviewers. 

“Database Tool of the Year” 

LAN Magazine (1989) 

“Best SQL Product” 

Data Based Advisor (1990) 

“Best Front End Tool” 

DBMS Magazine (1990) 

“1990 Byte Award of Distinction” 

Byte (1990) 
Now, create sophisticated
reports and queries with Quest. 

Quest is the perfect companion to all your 
SQL applications. If you use database 
programs created with SQLWindows, you can 
use Quest to create ad hoc reports and 
queries without additional programming. 

Take a look for yourself. 

Now you can get a closer look at your 
personal Window to SQL. 

Just send for your free Quest Demonstration
Kit, complete with a Quest demo disk and
the Quest Solutions booklet. 

You’ll learn how to create reports, graphs 
or tables, and arrange data on your PC any 
way you want. Because with Quest, what you 
see on the screen is what you get. 

To use Quest, all you need is Microsoft 
Windows 3.0, a PC with 2MB of memory and a
hard disk. You can use all of Quest’s powerful
features on a single, stand-alone PC.

The full Quest program is available from
your dealer at a suggested retail price of $495. 

Or, to find out what Quest can do for you, call 
us for your free demo kit. The Quest Solutions 
booklet is packed with ways to use Windows 
to unlock your treasure chest of corporate 
information. And the free demo disk lets you 
see just how easy Quest really is to use. 


Free demo kit. 

Includes Quest Solutions booklet and 
demo disk. To order, call toll free. 

1-800-388-4550 ext. 114 
We open windows to SQL

Graphical Tools Database Servers SQL Connectivity 

1040 Marsh Road, Menlo Park, CA 94025 
(415) 321-9500, FAX (415) 321-5471, Gupta Europe 44-628-478333 


SQLBase and SQLWindows are registered trademarks and Quest, SQL System and SQLNetwork are trademarks of Gupta Technologies, Inc. Microsoft is a registered trademark, and Windows is a trademark of Microsoft Corporation. Oracle is a registered trademark of 
Oracle Corporation. IBM and OS/2 are registered trademarks of IBM Corporation. Lotus and 1-2-3 are registered trademarks of Lotus Development. PageMaker is a registered trademark of Aldus Corporation. NetWare is a registered trademark of Novell. 
We’ll help you see your repository a little differently


It’s something every kid knows but 
doesn’t like to admit: if you keep your 
toybox organized, you get to spend more 
time playing and less time looking for 
things. The only problem is that keeping a 
neat toybox seems to spoil the fun. 

As a grown-up MIS professional, 
keeping information organized is the key to 
success. And it doesn’t have to spoil the fun. 

That’s where we come in. We’re 
BrownStone Solutions. We offer the most 
complete, intelligently architected, mature 
DB2 Data Dictionary and Administration 
solutions you can get. The kind that take 
care of you and your users. DB2 solutions 
that make your job easier, your people 
more productive and which enjoy the 
largest installed base of any products of 
their kind. 

How the big kids see it. 

Citicorp, Hewlett-Packard Co., Phoenix 
Mutual Life Insurance, Reader’s Digest, 
Pacific Bell, ICI Americas, Hewitt 
Associates and the U.S. Air Force are just a 
few who have chosen to make BrownStone 
an important part of their systems 
development environment. 

They made that decision for some very 
good reasons. 


For example, the DataDictionary/ 
Solution lets you define a methodology 
incorporating the use of a repository along 
with life cycle management, change 
controls, CASE and other tools. You can get 
a wide range of choices and all the benefits 
of a repository-based solution today.

BrownStone’s new release gives you an 
intuitive SAA/CUA compliant dialog. 
Without writing a line of code, you control 
the contents of the pull-down menus as 
well as windows and the help facility to 
guide users through their tasks. 

The big picture. 

Only BrownStone DB2 products make 
these kinds of “workbench controls” an 
inherent part of their architecture. The 
BrownStone DataDictionary/Solution 
understands projects, users, tools 
(including those you create), and an 
unencrypted, extensible E-R based 
repository. This lets you automatically
update menus, enforce dictionary security 
and more. 

In addition, a family of Administration
Solutions helps you implement and manage
DB2 and IMS databases, and bridge 
information to and from IEW workstations. 


These toolsets are fully integrated into the 
DataDictionary/Solution to take full 
advantage of its power. 

Put your repository to work. 

BrownStone Solutions helps you 
leverage your DB2 investment to make your 
environment more practical, powerful and 
productive—and ready for AD/Cycle. 

Call us now for our free, information-packed
brochure, “Putting Your Repository
to Work,” plus a free checklist to help you
evaluate any repository-based product you
may be considering.

Except the one in the picture. 

BrownStone 

SOLUTIONS 

Putting Your Repository to Work. 


BrownStone Solutions 
295 Madison Avenue 
New York, New York 10017 
1-800-627-7001

DB2 and AD/Cycle are trademarks of IBM Corp. 
© Copyright 1990 BrownStone Solutions, Inc. 
These days, it seems that almost everyone has a favorite Graphical User
Interface (GUI). And most organizations also have non-graphical user 
interfaces running on block mode terminals, character mode terminals and 
PCs. Which usually means that developers must spend months rewriting each 
application for each incompatible system. 

Unless the applications are built with Oracle® Tools. 

An application developed with Oracle Tools automatically adapts to the 
native look and feel of the computer on which it runs. On Sun, IBM, DEC, HP, 
PCs, Macintosh and virtually any other computer. Even on character and block 
mode terminals. All without changing a single line of code. 

Today, Oracle Tools like SQL*Forms and SQL*Menu work with Microsoft 
Windows and Presentation Manager. And they fully support Motif, Open Look, 
Macintosh, block mode and character mode. 

So your applications can be deployed across all the computers in your 
organization. Your users can take full advantage of their GUI without having to 
be re-trained. And your programmers don’t waste time recoding applications 
for each user interface. 

Call us at 1-800-633-0553 Ext. 5756 and you’ll receive the free Oracle Tools
Information Kit, illustrating the full capabilities of Oracle Tools. 

It’ll show you how to solve any GUI mess once and for all. 


ORACLE 

Software for people who can’t predict the future. 


©1992 Oracle Corporation. ORACLE, SQL*Forms and SQL*Menu are registered trademarks of Oracle Corporation. Macintosh is a trademark 
of Apple Computer, Inc. Open Look is a trademark of AT&T. Motif is a trademark of Open Software Foundation, Inc. Presentation Manager 
is a trademark of IBM, Inc. Windows is a trademark of Microsoft Corporation. All other trade names referenced are the trademark of the 
respective manufacturer. Call for hardware and software requirements. Outside the U.S.A. but within North America, call 1-800-668-8925 
for product, service, and seminar information. 
Without the right tools, it is almost impossible to
observe what's going on inside DB2. Inside the black
box. As a result, tasks that should be handled by
various maintenance groups are consuming precious 
DBA resources. 

That's why Candle offers a complete set of tools to 
illuminate DB2. Tools that observe problems based on 
your exception thresholds. Isolate the root cause of 
problems. And automatically fix problems or quickly 
recommend solutions. 

Using Candle's tools, day-to-day Catalog
activities are simplified and automated. Overallocated
DASD space is reclaimed. Time spent running reorgs and 
image copies is reduced. Poor-performing SQL is isolated 
and explained. And performance and service levels are 
easily monitored. 

By simplifying and automating such time-consuming 
tasks, Candle's tools allow these tasks to be delegated to 
the appropriate groups, freeing the DBA to unleash the 
hidden power of DB2. 

♦Candle 


FREE

ILLUMINATOR KIT 

To become more enlightened about 
how Candle's powerful tools can 
streamline DB2 management and 
administration, send for your free Illuminator Kit today. Complete the 
coupon and mail to Candle Corporation, 1999 Bundy Drive, Los 
Angeles, CA 90025. For faster service call 800 843-3970, Dept 415. 



Copyright © 1991 Candle Corporation. All Rights Reserved 


CIRCLE 3 ON READER SERVICE CARD 


DB2 is a registered trademark of International Business Machines, Inc.


EDITOR'S BUFFER 


Client/server computing forces us to reexamine the data-centric approach 


AS HERB EDELSTEIN observes this month, there's much more to
downsizing systems than kicking out the big computers and
replacing them with smaller ones. Downsizing is just the most
graphic of the various phrases (cooperative processing,
client/server computing, open systems) being used to describe
the trend toward decentralized, distributed computing: something
that has many MIS shops and their favorite proprietary vendors
trembling. It has brought us new enterprise concepts, such as
IBM's Information Warehouse and Digital Equipment Corp.'s
Network Application Support, and has made heterogeneous a
household word. It is also opening up considerable new
opportunities, hazards, and topics to reexamine.

In fact, the client/server architecture may force us to revisit
the whole notion of a database management system. A DBMS, to
use C.J. Date's definition, is "a computerized system whose
overall purpose is to maintain information and to make that
information available on demand." Date identifies the components
of a database system as data, hardware, software, and users. The
multiuser DBMS establishes a central point of control from which
to govern the data resource requirements of related software
(report writers, application tools, utilities, and so forth)
plus the other three components. Since the relational
revolution, the emphasis has been on managing the data. Software
development complexity, the exponential increase in data, and
the diffusion in types of users coming at the database have
pushed the DBMS notion up to a higher level; and thus, we now
talk of the repository, a facility to manage data about the
data.


Return of the Process
But progress towards greater centralization is being outpaced
(and perhaps undercut) by other developments. In a distributed
environment, data, hardware, software, and users are dispersed.
Responding to the crying need but slow pace of distributed
database solutions, hardware and software vendors developed the
notion of client/server computing. Few vendors still seem to be
pursuing a truly data-centric solution to distributed database
(the most notable survivor may be IBM, with its Distributed
Relational Database Architecture). The client/server approach
increasingly focuses on distributing processes.

The data warehouse, which was popularized by IBM last September
(and covered extensively the past year by Bill Inmon in For
Managers Only), is a good example of the architectures springing
up to help MIS shops figure out how to answer the need for
process distribution, and yet control it. This approach splits
applications into two categories, decision-support systems (DSS)
and online transaction processing (OLTP). The DSS category rests
more on the client side, while the OLTP applications more
intensively involve the server. The distinction highlights the
two demands tugging the database in opposing directions:
business flexibility and performance.

THE client/server approach was at first hamstrung by the lack of
"front ends": software applications designed to interact with a
server, without requiring the user to write a lot of difficult
code. But not anymore; as we see in Richard Finkelstein's
article, DBMS-independent tools are proliferating in the market.
Attendees of some recent database conferences, including the
upcoming DB/Expo (San Francisco, March 23 through 26), could
certainly testify to the explosion of data access and retrieval
software. Most such products now reach the database server by
hooking into an application programming interface provided by
the DBMS vendor (although perhaps someday a standardized
interface will be built to industry specifications, such as SQL
Access Group's).

As Finkelstein details, the front-end/back-end connection is
still not smooth: facilities overlap, access languages are not
always harmonious, and concurrency and integrity issues remain.
Problems that may not have existed in a tight, proprietary
relationship between tool and DBMS suddenly threaten. In the
months ahead, we will be exploring these issues and kinds of
products more. But one has to consider: As front-end tools pull
further away from their DBMS roots, and as other, more foreign
types of data access software press into the database, is the
DBMS paradigm itself going to pull apart as it moves further
from its roots?

Despite their wayward, data-centric leanings, at some point
DBMSs have always had to make peace with their process-oriented
parents, the operating system and hardware. (And in a networked,
client/server world, the DBMS will also have to acknowledge its
eccentric uncles and aunts, network operating systems and
protocols.) To quote Date again from An Introduction to Database
Systems, "the DBMS has a view of the database as a collection of
stored records, and that view is supported by the file manager;
the file manager, in turn, has a view of the database as a
collection of pages, and that view is supported by the disk
manager; and the disk manager has a view of the disk 'as it
really is.'" To solve performance problems, vendors (such as
IBM, DEC, and other hardware companies) have typically
integrated their DBMS products more tightly with the operating
system and hardware. Such "bundling," of course, has been in
part a marketing strategy to make it tougher for independent
software vendors to loosen the hardware vendor's account
control.
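
Date's three-layer view can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class names, the 64-byte page size, and the one-record-per-page layout are all hypothetical and are not drawn from Date's book or from any vendor's product. It simply shows each layer talking only to the layer beneath it.

```python
PAGE_SIZE = 64  # bytes per page; deliberately tiny, chosen only for illustration


class DiskManager:
    """Sees the disk 'as it really is': here, one flat run of bytes."""

    def __init__(self) -> None:
        self.raw = bytearray()

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.raw[offset:offset + length])

    def write(self, offset: int, data: bytes) -> None:
        end = offset + len(data)
        if end > len(self.raw):
            # Grow the simulated disk so the write fits.
            self.raw.extend(b"\x00" * (end - len(self.raw)))
        self.raw[offset:end] = data


class FileManager:
    """Sees the database as a collection of fixed-size pages."""

    def __init__(self, disk: DiskManager) -> None:
        self.disk = disk

    def read_page(self, page_no: int) -> bytes:
        return self.disk.read(page_no * PAGE_SIZE, PAGE_SIZE)

    def write_page(self, page_no: int, data: bytes) -> None:
        if len(data) > PAGE_SIZE:
            raise ValueError("record does not fit in one page")
        # Pad to a full page so every page occupies the same space on disk.
        self.disk.write(page_no * PAGE_SIZE, data.ljust(PAGE_SIZE, b"\x00"))


class DBMS:
    """Sees the database as a collection of stored records."""

    def __init__(self, files: FileManager) -> None:
        self.files = files
        self.next_page = 0

    def store(self, record: str) -> int:
        # Simplifying assumption: one record per page; the page number is the record id.
        record_id = self.next_page
        self.files.write_page(record_id, record.encode("utf-8"))
        self.next_page += 1
        return record_id

    def fetch(self, record_id: int) -> str:
        return self.files.read_page(record_id).rstrip(b"\x00").decode("utf-8")
```

A caller works only with records (`DBMS(FileManager(DiskManager())).store("Smith, J.")`) and never sees pages or byte offsets. The tighter integration described above amounts to letting the upper layer bypass or tune the layers below it.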

But even independent DBMS software vendors have headed their
products in the direction of new hardware paradigms in pursuit
of higher performance. Oracle, for example, has been making a
lot of noise about a version of its DBMS that takes advantage of
scalable, massively parallel supercomputers. This new version
was made possible by an adjustment in v. 6.2, enabling the
product to run on loosely coupled VAX clusters. Oracle has
progressed past that point to the massively parallel machine,
such as the nCUBE 2 Supercomputer, which reportedly eliminates
bottlenecks by using a distributed memory structure in which
each processing node consists of a processor and some local
memory. Thus, multiple database servers can run concurrently and
independently, with their own memory for database buffers and
backup and recovery facilities.

This development has enabled Oracle to exploit symmetric
multiprocessor environments offered by Sequent Computer Systems,
Sun Microsystems, and several other hardware vendors.
Interestingly, Oracle's direction is also pushing DEC a bit
harder toward UNIX (that is, Ultrix). DEC VAX salespeople used
to like Oracle (over its own DBMS, Rdb) because it helped them
sell boxes; but now Oracle seems to favor other hardware,
particularly systems running under UNIX. Rdb, a fine but scantly
marketed DBMS (bundled with the VMS operating system, it's given
away free), is rapidly being ported to UNIX. The blood feud
between Oracle and DEC may be only just beginning.

One other famous solution to the database performance problem
was the database machine, the first of which was marketed by
Britton-Lee (now Sharebase and owned by Teradata Corp.). Bob
Epstein, the founder of Sybase Inc. and a member of the
University of California, Berkeley team that developed Ingres in
the late 1970s, spent some time at Britton-Lee. He brought some
of the database machine ideas with him to his own start-up,
particularly the notion of an "intelligent" server that contains
stored commands or procedures for later execution. "We want the
database to model not only the data, but the business," Epstein
said in a 1990 interview. Stored procedures and triggers have
become fundamental concepts of today's client/server
computing.
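
To see what it means for the server to "model the business," consider a trigger. The sketch below is an assumption-laden illustration, not Sybase code: it uses Python's standard `sqlite3` module as a stand-in server (SQLite supports triggers but not stored procedures), and the table names, trigger name, and helper functions are hypothetical. The business rule, keeping a running total per customer, lives in the database, so every client gets it without writing any code for it.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database whose trigger enforces a business rule."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (
            id       INTEGER PRIMARY KEY,
            customer TEXT NOT NULL,
            amount   REAL NOT NULL
        );
        CREATE TABLE customer_totals (
            customer TEXT PRIMARY KEY,
            total    REAL NOT NULL
        );
        -- The rule is stored in the server and fires on every insert,
        -- regardless of which front-end tool issued the statement.
        CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
        BEGIN
            INSERT OR IGNORE INTO customer_totals (customer, total)
                VALUES (NEW.customer, 0);
            UPDATE customer_totals
                SET total = total + NEW.amount
                WHERE customer = NEW.customer;
        END;
        """
    )
    return conn


def record_order(conn: sqlite3.Connection, customer: str, amount: float) -> None:
    """A client only inserts the order; the server maintains the total."""
    conn.execute(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)", (customer, amount)
    )


def total_for(conn: sqlite3.Connection, customer: str) -> float:
    row = conn.execute(
        "SELECT total FROM customer_totals WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0] if row else 0.0
```

After `record_order(conn, "Acme", 100.0)` and `record_order(conn, "Acme", 50.0)`, `total_for(conn, "Acme")` returns `150.0`, even though no client ever touched `customer_totals` directly.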

Sybase was also one of the first developers to build
multitasking capability into the DBMS, independent of the
operating system's functions. To describe this feature, I would
like to quote from an excellent November 1991 Strategic
Directive from New Science Associates, a consulting and research
firm, written by John Girard (for more information on this
report, call 203-259-1661): "Sybase and its competitors were
conceived in the original single-processor UNIX and VMS
environments. From the start, Sybase
How Borland’s Paradox Is
Making News in the USA... TODAY 


USA TODAY ® has changed the way Americans 
look at newspapers. As “The Nation’s Newspaper™,”
USA TODAY goes after critical, in-depth
stories that affect people across the country. 

But because many of those stories start with 
mountains of numbers and statistics, getting the 
facts is a difficult, time-consuming task. And in 
the deadline-oriented newspaper business, there 
is simply no time to waste. 

To solve these complex problems, USA 
TODAY chose Borland’s Paradox.® 

The Dawn of 
“Database Journalism” 

USA TODAY'S Special Projects Unit has the 
daunting task of gathering and analyzing various 
government reports and statistics to get to the 
important stories hidden underneath. 

They call their work “database journalism,” 
and Paradox is the tool of the trade. 

Reporters download information (from
census figures to campaign finance data to crime
statistics) into specially created Paradox databases
for easy searching and analysis. In many
cases, Paradox imports these figures directly, 
since it reads dBASE® and Lotus® files. 

The S&L Scoop 

In fact, Paradox and database journalism were key 
figures in USA TODAY'S groundbreaking series of 
reports on the Savings & Loan crisis. 

Early in 1989, USA TODAY'S Special Projects 
Unit used Paradox to investigate and analyze the
health of more than 3,000 S&Ls across the 
country. As a result, USA TODAY readers were 
among the first to know about the national 
ramifications of this important story. 

A Messenger Called 
AMANDDA 

USA TODAY has 29 bureaus and regional offices 
nationwide. The paper’s production depends on 
all those offices sharing messages and vital 
information. 

To handle this information flow, a system 
called AMANDDA (for Automated Messaging 
AND Directory Assistance) was created using 
Paradox to work with the corporate electronic 
mail system. This remarkable partnership links 
every USA TODAY office, and provides instant 
access to a variety of information sources, the 
lifeblood of a daily newspaper. 

And with Paradox, AMANDDA was up and 
running in less than two months. 

Just the Facts 

Paradox has all the powerful database features 
you need. Features like superior single- and 
multiuser access. Query By Example (QBE) to 
simplify finding the information you’re looking 
for. Presentation-quality graphics for outstanding 
reports. The turbo-driven VROOMM™ system for 
maximizing memory use. Multi-table forms that 
let you look simultaneously at information from 
several tables. And that’s only the beginning. 
For developers, Paradox includes the comprehensive
Paradox Application Language (PAL™).
PAL puts the power of Paradox at your fingertips,
and a set of robust development tools completes
the programming environment. Exactly what
you’d expect from a world-class database management
system.

Some Independent Opinions 

Some of the most respected industry publications 
have also made Paradox their #1 choice. 

“With its combination of speed, ease-of-use, 
and practical features, Paradox [is] an excellent 
value.” — Computerworld 

April 8, 1991

“Paradox is an outstanding product for 
developers and end users. It’s one of the finest 
combinations of programming and interactive 
environments available.” — InfoWorld 

November 12, 1990

Get Paradox. 

And Get the Whole Story. 

Join USA TODAY and thousands of other companies
who keep costs down and profits up with
Paradox. 

To order Paradox, 
see your dealer today or call 
1-800-331-0877, Dept. 6264 

Copyright© 1991 Borland International, Inc. All rights reserved. Paradox, VROOMM and 
PAL are trademarks of Borland International, Inc. Bl 1445 
chose to implement a more difficult but rewarding architecture
for its access routines, based on 'threaded' processing. In
mid-1985, most operating systems did not support threaded
applications, so Sybase wrote its own thread manager. The impact
was to allow Sybase to run with only one set of its access
routines [termed an 'engine'] per processor, which served all
connected users.

"Threaded code is similar in many respects to multitasking code,
but offers some additional advantages. First, threading allows
the access engine to run as a single process within a single
address space. This means that user applications communicate
with the database engine through efficient shared memory paths
rather than through more time-consuming interprocess
communications mechanisms. Second, locks and contentions were
also easier to manage, because they were controlled from a
common, high-speed memory cache. Third, and most important, is
that Sybase is in complete con-


Desktop DBA 

Serving up database servers... 
Windows style! 



Desktop DBA delivers Windows® 
3.0 front end power to match that 
of SQL database servers. With 
Desktop DBA, you can manage 
any number of database servers 
on your network simultaneously, 
each in its own window. 

Desktop DBA goes beyond even 
the high end utilities available in 
mainframe environments, with 
features like drag-and-drop database
migration. Point-and-click
object management. Automatic 
corruption detection and repair. 


All of this means you don't have 
to accept any more excuses about 
how client/server technology is 
lacking tools for DBAs and 
developers. With Desktop DBA, 
all the pieces are in place. 

Desktop DBA for Microsoft® SQL 
Server™ and SYBASE® available 
now. Desktop DBA for Oracle® 
available early 1992. 

■■■■■ The end 

company 

1926 East Parham Road 
trol of access routines and process behavior because it was not
relying on the less-efficient and generalized control and I/O
routines provided in each operating system ... The Sybase
architecture bypasses most of the general-purpose functions in
the operating system, substituting its own, optimized,
reduced-instruction routines. In fact, the underlying model is
quite similar to that of a RISC processor
design."
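
The single-engine idea in the quoted passage can be sketched in a few lines. This is a loose illustration only: it uses Python's standard `threading` module, which relies on operating-system threads, whereas the report says Sybase wrote its own thread manager. The `Engine` class, its methods, and `run_sessions` are hypothetical names. What it does show is many user sessions sharing one address space, one cache, and one lock, with no interprocess communication.

```python
import threading


class Engine:
    """One set of access routines serving every connected user."""

    def __init__(self) -> None:
        self._cache = {}                # shared in-memory cache, one copy for all users
        self._lock = threading.Lock()   # contention is resolved in memory, in one place

    def increment(self, key: str) -> None:
        with self._lock:
            self._cache[key] = self._cache.get(key, 0) + 1

    def read(self, key: str) -> int:
        with self._lock:
            return self._cache.get(key, 0)


def run_sessions(engine: Engine, users: int, requests_per_user: int) -> None:
    """Simulate user sessions as threads inside the engine's own process."""

    def session() -> None:
        for _ in range(requests_per_user):
            engine.increment("hits")

    threads = [threading.Thread(target=session) for _ in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With `run_sessions(engine, users=8, requests_per_user=500)`, `engine.read("hits")` returns 4000: every session updated the same cached value directly, and the lock kept the updates consistent.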

Epstein was a true pioneer of the client/server approach; Sybase
recognized that in order to provide server intelligence and
adequate response to OLTP demands, it had to gain control of
more of the performance knobs and levers from the operating
system. However, Sybase has not been immune to change. The
company had to rework the product in 1990-91 so that its
multithreaded architecture would work with symmetric
multiprocessing systems.

Just as client/server computing offers front-end flexibility and
independence, it is also shaking up the back end. Thus, for
OLTP, some would suggest that we'll see the new class of
transaction processing machines (such as Transarc's Encina)
compete with database servers. Other start-ups, such as Red
Brick Systems and UniSQL, are developing specialized systems for
better performance. I'm afraid whoever suggested that DBMS
products were now commodities was far too premature.

WE'RE OPENING ANOTHER communications avenue to our readers: a
comment line. Thanks to the powers of voicemail, we have set up
an extension for those who would like to leave us a brief
message regarding articles and columns we've run, topics or
products we should be covering, or even industry tidbits
(rumors, hot leads, that kind of thing).

Now, the busy reader who hasn't the time to write can give an
"instant" comment. The phone number is: (415) 905-2785. We look
forward to your recording!

David Stodder 
Editor 
Now the SAS/C® compiler is
the perfect choice for 
CICS applications, too. 


Now you can develop command level CICS 
applications with the industry’s leading mainframe
C compiler, the SAS/C® compiler from
SAS Institute. Running under MVS or VM/CMS, 
the SAS/C compiler and CICS preprocessor can be 
used to develop transaction-oriented applications 
for CICS (Release 1.7 or above) in MVS, 
MVS/ESA, and DOS/VSE environments. 

With the SAS/C compiler’s command preprocessor, 
all CICS commands, including DL/I and BMS, can 
be freely interspersed with C code. And that 
includes support for commands that other C 
compilers simply can’t handle (such as HANDLE 
CONDITION, IGNORE CONDITION, 
PUSH/POP HANDLE and HANDLE AID). 

File I/O for CICS transient data queues (both 
extrapartition and intrapartition) and JES spool 
files are fully supported through standard C library 
functions. Run-time storage analysis and usage 
reports are also available. And with our exclusive 
support for JES spool files, you can retrieve a file 
from the JES spool, write a file directly to the JES 
spool, and send a file to a remote destination via 
systems connected to a JES/RSCS network. 


CICS Systems Programming Too! 


The SAS/C compiler set the standard for systems 
programming in C. And now our exclusive Systems 
Programming Environment makes easy work of 
writing CICS exits and utilities as well. 

With frequent updates and knowledgeable technical 
support, both provided free, the SAS/C compiler is
the best investment you can make toward more 
effective and maintainable CICS applications. 


Learn More With a Free 
Programmer’s Report 


Our free Programmer’s 
Report: Developing CICS 
Applications in C provides an 
excellent introduction. To 
receive a copy, simply mail 
or FAX the coupon below. 

Or call us right now to find 
out how you can evaluate 
the SAS/C compiler free 
for 30 days.
diagnostic control, inline machine code, and
dynamic loading also work under CICS. And for 
handling, full tracebacks, and a complete set of
warning/error messages. 

The SAS/C transient library is freely redistributable 
with your CICS applications. There are no hidden 
run-time fees. Or you can use our exclusive 
All-Resident Library to automatically link the 
application and eliminate the need for a transient
library altogether. 
Our data
dictionary 
organizes 
even better 
than she does. 

With IBM’s AD/Cycle on 
the horizon, getting your 
meta data in order is long 
overdue. Fortunately, 

DB EXCEL,® the innovative 
family of repository-based 
tools to support the IBM SAA 
development and maintenance 
environment, makes it easier 
than ever to collect and maintain
information about your
enterprise systems. 

DB EXCEL is a DB2-based 
data dictionary with a long 
list of unique features that 
makes it the ideal way to prepare
for a successful migration
to IBM’s Repository and
AD/Cycle. DB EXCEL supports
both DB2 and IMS. It
offers bidirectional interfaces 
to popular CASE tools like 
Bachman, KnowledgeWare’s 
IEW/ADW and Intersolv's 
Excelerator, for leveraging 
your investment in those 
technologies. 



Everything about 
DB EXCEL is designed to 
make standards implementation
easy. Customization is
simple, thanks to DB EXCEL’s 
automated extensibility. And 
the PC-like pop-up windows 
get users defining and 
maintaining data entities 
quickly. Even repository 
editing is a breeze, with a 
powerful edit facility 
that supports editing 
multiple entities on 
one screen (mass 
edit) or sequential 
editing of a list of entities 
(queue edit). 

As an IBM Business 
Partner, we made sure that 
DB EXCEL’s flexible, dynamic 
architecture fits in with IBM’s 
plans for AD/Cycle. DB EXCEL
can allow smooth implementation
of RM/MVS in
the future, and can function 
as an integral part of that 
solution once it has been 
installed, so your investment 
in DB EXCEL is protected. 

To make using DB EXCEL 
even simpler, we back it with 
superior service, from hands-on
installation assistance to
detailed, quality manuals
and documentation.

Check out DB EXCEL for 
yourself, with our 30-day
free trial. 

Call 1-800-333-4899 to 
arrange for an on-site demo. 



CHARTER MEMBER 


RELTECH

Suite 450 • 3211 Jermantown Road 
Fairfax, Virginia 22030-2844 

DB EXCEL is a registered trademark of Reltech Products. All other trademarks are proprietary to their respective manufacturers. 
CIRCLE 6 ON READER SERVICE CARD 

ACCESS PATH 


Lauds and lectures, kudos and critiques 


DEAR EDITOR: 

I would like to respond to Barbara 
von Halle's April 1991 column 
("The Year in Review," Database 
Design) in which she says, "The 
migration from application-oriented 
(narrow-scoped) databases to shared 
databases is as much a cultural 
change as a technological one." 
Actually, I think the change is 
more cultural than technological. 
Most systems-development depart¬ 
ments and senior-level managers 
are rooted in the '70s and believe 
that data sharing attacks the cul¬ 
ture that has built up over the last 
20 years. Any data administration 
manager trying to foster a shared- 
data environment should keep a 
resume handy and remember Ma- 
chiavelli: "It must be considered 
that there is nothing more diffi¬ 
cult to carry out, nor more doubt¬ 
ful of success, nor more dangerous 
to handle, than to initiate a new 
order of things" (The Prince, Oxford 
University Press, 1979). 

I recently attempted to edu¬ 
cate my boss and his colleagues on 
the shared-data environment's im¬ 
portance and steps for its imple¬ 
mentation. However, under the 
guise of a "relative importance fac¬ 
tor," my position as manager of 
data administration was eliminat¬ 
ed and I was "let go." Of course, 
the culture was saved. 

Has any other company had 
success implementing data sharing? 

Keith C. Hulslander 
Dover, Pennsylvania 

Columnist Barbara von Halle responds: 
Hulslander's situation is a sad one and 
I tried to solicit some immediate an¬ 
swers to his letter. 

Regarding data sharing, Deborah 
L. Brooks, data specialist at a 
Washington, D.C.-based corporation, said, 
"The cultural change toward data 
sharing is slow but sure. With 
increased pressure from domestic and 
overseas competitors, American 
corporations are forced to define their 
corporate strategies clearly. With this 
shift in focus to corporate-level 
resources, it's beginning to be recognized 
that data transcends organizational 
boundaries." Brooks believes the days 
of individual fiefdoms with closely 
guarded information stores are numbered. 

We would like to hear from you! 
If you have suggestions or comments 
about our publication, 
write: Editor, DATABASE PROGRAMMING 
& DESIGN, 600 
Harrison St., San Francisco, CA 
94107. 

Carolyn Kohler, manager of data 
design coordination at Levi Strauss & 
Co. in San Francisco, said that data 
sharing has become a reality in her 
company, and agreed that the change 
is more cultural than technological. 
She explained, "Levi Strauss & Co. 
embarked on the path of data integra¬ 
tion a few years ago and has succeeded 
in building an enterprise model and in¬ 
volving the users in the gathering of 
data requirements and implementation 
of five shared subject area databases." 

Kohler believes that Levi Strauss's 
data sharing success was due to the 
support and commitment from Infor¬ 
mation Resources management (from 
the CIO down) as well as the user 
community, and a companywide "I'll 
try it" attitude. 

DEAR EDITOR: 

Database Programming & Design is 
an excellent, professional product 
that achieves industrywide recog¬ 
nition. By far its weakest attribute, 
however, is the Celko on SQL col¬ 
umn. Celko presents a naive, flip¬ 
pant attitude and viewpoint on 
most issues and isn't consistent 
with the quality of your other 
authors. In particular, I'd like to 
respond to Celko's views on 
downsizing ("On Top of the World," 
Celko on SQL, August 1991). 

Celko states that "now down¬ 
sizing means replacing large main¬ 
frame computers with networked 
PCs." Although I don't disagree 
that downsizing is a trend in the 
industry, defining downsizing as a 
replacement of large mainframes 
with networked PCs is inaccurate. 
As Celko suggests, some applica¬ 
tions could downsize to networked 
PCs, but not the majority of appli¬ 
cations running on large mainframe 
computers. In addition, his col¬ 
umn suggests that downsizing to 
PCs is easy and inexpensive. This 
attitude is naive and requires fur¬ 
ther elaboration. 

Although I'm not suggesting 
that I know the true definition of 
downsizing, I would like to distin¬ 
guish between a downsized envi¬ 
ronment and a large mainframe 
environment. A downsized envi¬ 
ronment would most likely have 
the following characteristics: 

□ A greater number of inde¬ 
pendent computers; 

□ Larger physical distribu¬ 
tion of computers; 

□ More sophisticated network 
and application software, system/ 
configuration management, and 
professionals; 

□ Higher initial as well as 
sustaining costs; 

□ Greater user productivity 
(not cost savings). 

The large mainframe com¬ 
puter environments migrating to a 
downsized environment will like¬ 
ly view the mainframe as a data 
and information server for the en¬ 
terprise. Therefore, some portion 
of the mainframe complex will re¬ 
main part of the overall computer 
infrastructure. 

A properly selected migra¬ 
tion path from large mainframe 
computers should result in a more 
productive environment for users, 
but probably won't result in less 
actual computing costs. As the pre¬ 
ceding list implies, more costs are 
associated with downsizing than 
cost-avoidance with cheaper MIPS 
on the desktop. As with any deci¬ 
sion, the business case drives the 
solution. 

Michael Krasowski 
Mission Viejo, California 

Columnist Joe Celko responds: I have 
over 200 publication credits in the 
computer trade press, so I think I have 
my column down pretty well by now. 


In addition, I've managed to earn a 
decent living as a consultant, which 
isn't exactly a job for the naive. 

Krasowski's remark that the ma¬ 
jority of programs on mainframes 
can't be moved down to smaller ma¬ 
chines is interesting. Can he prove it? 
I'll give a counter example. Dr. George 
Schussel of Digital Consulting Inc. has 
a client who moved a 130,000-line 
COBOL program from an IBM main¬ 
frame to Micro Focus's COBOL. This 
conversion required only 12 changes in 
the source code and now runs faster. 

As far as the mainframe acting 
as a file server, I have the same doubts 
as Dr. Schussel. From my own 
experience, I can give Krasowski details on 
large- to medium-sized Canadian banks 
that work on networked PCs without a 
single mainframe. 

I wonder why Krasowski as¬ 
sumes that a downsized environment 
will have a greater number of indepen¬ 
dent computers and larger physical 
distribution. I look at traditional main¬ 
frame shops today and every PC is 
used as a word processor and spread¬ 
sheet machine. Does this mean a shop 
is downsizing if everyone owns a pock¬ 
et calculator? I don't think so. The de¬ 
fining characteristic is access to corpo¬ 
rate data resources, which implies 
some type of network. 

I'm not sure what Krasowski 
means by "sophisticated," but I'll as¬ 
sume it means complex or expensive. 
Indeed, I hope we'll see more sophisti¬ 
cated network systems, where systems 
do all of the hard work so that we 
don't have to hire sophisticated profes¬ 
sionals. I have high hopes that the deal 
between Digital Research and Novell 
will produce a high-quality, integrated 
network operating system that blows 
the socks off those we have now, which 
are weak on recovery, access, logging, 
and security. Traditionally, these sys¬ 
tems were tacked on top of existing 
single-user operating systems, not in¬ 
tegrated into them. 

I expect to see more application 
packages on smaller machines as time 
goes on, which will probably be ported 
down from mainframes. When I can 
get the same commercial package on a 
PC as a mainframe, I'll know the main¬ 
frame has died completely. 

A pattern seems to exist in com¬ 
puting that makes life easier for the 
user. When mainframes were first de¬ 
veloped, users had to write custom 
software for every application. When 
minicomputers came out, the same thing 
occurred. Why should networked PCs 
be any different? 

I don't understand Krasowski's 
idea that greater productivity won't 
result in cost reduction. If we assume 
that productivity is measured in trans¬ 
actions per day, the only way that a 
greater number of transactions per day 
could result with the same cost per 
transaction would be to increase other 
costs. And does Krasowski think that 
maintenance costs for PCs are higher 
than for mainframes? I wonder how 
many PCs I can buy and service for 
the monthly rental on one mainframe? 
DB2 buffer pools: 
sink or swim 



You can drown in wasted buffer pool 
space or run aground if you don't 
adjust your buffer pool sizes based 
on workload needs. But you can't 
afford the downtime associated with 
changing buffer pool sizes, so you 
do what you can to stay afloat. 


Announcing 

OPERTUNE 


Now for the first time, you can 
achieve continuous availability while 
changing buffer pool sizes. This 
capability is available only with 
OPERTUNE™ for DB2* from BMC 
Software. 


When peak processing times 
require larger buffer pools, you can 
increase the sizes to meet your 
needs, reducing bottlenecks and 
speeding response times. Likewise, 
when requirements call for reduced 
buffer pools, you can quickly and 
easily modify them, saving storage 
space. 

OPERTUNE lets you achieve 
optimum performance while main¬ 
taining continuous data availability 
by allowing you to dynamically tune 
for your changing DB2 needs. 


Dynamically changing buffer 
pool sizes is achieved with BMC 
Software's exclusive technology. It 
uses DB2 to implement many 
changes and requires absolutely no 
USERMODs or dynamic hooks. 


Take control 


Achieve continuous availability 
while tuning for optimum perform¬ 
ance. For more information about 
changing DB2 buffer pool require¬ 
ments or any of the additional 
OPERTUNE features, contact BMC 
Software today at 713 240-8800 or 
1 800 841-2031 



SOFTWARE 


The Experience. The Technology. The Future. 

BMC Software international offices are located in Australia, Canada, Denmark, France, Germany, Italy, Japan, Spain and the United Kingdom. 
*DB2 is a registered trademark of International Business Machines Corp. ©1992, BMC Software, Inc. All rights reserved. 
Taming DB2 with TABLES 

Improve application development productivity 20-50 times. 


Prototype and maintain DB2 
tables in production CICS and IMS 
without programming... 
distribute production table 
maintenance to end users... 
bring your table applications 
on-line by as much as 50 times 
faster than using CASE tools... 
And do this with the full 
confidence of SAA compliance.* 

Maintenance screens can be 
created by simply naming a table. 
Editing, validation, and 
processing logic, if needed, can 
be added using TABLES custom 
development facilities. 

For ultra high performance 
(5x faster than DB2 buffers), tables 
can be loaded into common 
memory dataspaces and 
concurrently shared across 
multiple regions including: CICS, 
IMS, TSO and batch. 


The Power To 
Make it Simple 

■ Applications up to 50 times faster 
than CASE tools 

■ Production on-line maintenance 

■ Static SQL 

■ Common memory dataspace 
five times faster than DB2 buffers 

■ Time dependent row processing 

■ Menus and help screens 

For a free trial or to learn 
more about how TABLES is 
bringing new meaning to the 
term productivity. 


* Registered in IBM's SAA Catalog 


SPECIALIZED SOFTWARE INTERNATIONAL 

THE TABLE MANAGEMENT COMPANY™ 


call (800) 328-2825 


DB2 is a trademark of the IBM Corporation 
DATABASE DESIGN 


BY BARBARA VON HALLE 

Share 
and Share 
Alike 


The data-sharing lesson is tough to learn, but it's a worthwhile exercise 


Have you ever tried 
to teach a child the 
value of sharing? The 
first lesson usually 
proceeds along these 
lines: "You should share your 
possessions with your siblings, friends, 
and even people who aren't your 
friends because it's a nice thing to 
do—it makes the other person feel 
good." Then, it progresses to: "No, 
I won't buy another one—you can 
both use the same one ... Why? 
Because I said so!" And finally: "If 
you don't share it, I'll take it away, 
and no one will have it!" With 
lessons like this one, no wonder our 
perceptions of sharing are 
somewhat less than optimal. 

My six-year-old son has 
developed an interesting strategy for 
sharing his Teenage Mutant Ninja 
Turtles with his four-year-old 
brother. He bases it on four carefully 
constructed "sharing principles." 
First, he and his brother never play 
with the same turtle at the same 
time. Second, his brother "shares" 
only one turtle at a time. Third, his 
brother can "share" a turtle only 
when the older one isn't playing 
(or even thinking of playing) with 
it. Fourth, and most importantly, 
the turtles aren't community 
property that can be leveraged for the 
good of both siblings. In fact, the 
younger sibling sacrifices all 
sharing rights the moment he refers to 
them as "our turtles." 

Obviously, most of us never 
want to share anything. But it's 
interesting that a lot of us have 
chosen a career in which "wide-scale 
sharing" is the ultimate goal. 
However, what should we do when 
familiar tactics, such as removing the 
"shareable object" or refusing to 
duplicate it, don't work in the 
corporate world? 

For starters, we have to find 
more civilized data-sharing 
strategies. Two months ago, we began 
developing such a strategy using 
John Zachman's Information Sys¬ 
tems Architecture Framework (ISA 
Framework). The framework aims 
at determining an architecture for 
a single information system. We 
extended this scope to include our 
entire data-sharing universe, which, 
by definition, encompasses many 
information systems. 

Since last month, we've been 
suspended in the second row of 
the data column labeled Business 
Data Model. We arrived here by 
solidifying the data-sharing scope, 
organizing the business communi¬ 
ty around this scope, and deter¬ 
mining the specific purpose for de¬ 
veloping the business data model. 
We also accepted Zachman's sug¬ 
gestion that this cell results in 
business entities and relationships. 
However, alternatives such as ob¬ 
ject and Nijssen's Information Anal¬ 
ysis Methodology (NIAM) models 
are possible as well. 

While poised in this cell, we 
questioned what its deliverables 
look like. We proposed two kinds 
of deliverables: a conceptual, plan¬ 
ning-level business data model 
and a detailed, working-level busi¬ 
ness data model. This cell prob¬ 
ably wasn't meant to house these 
two data models. Rather, Zachman 
and others place the detailed busi¬ 
ness data model in the next row of 
the framework labeled Information 
Systems Model. 

This month, we'll explore the 


conceptual and detailed business 
data models, and discuss where 
each may fit in the ISA Framework 
and our evolving data-sharing 
strategy. 

A CONCEPTUAL BUSINESS 
data model depicts 
high-level entities; 
however, it sometimes 
represents data 
groups or data areas rather than 
normalized entities. A conceptual 
data model for a large enterprise 
may consist of 10 to 30 entities, 
such as product, customer, external 
organization, order, and so on. 

The conceptual business data 
model may also contain many-to- 
many relationships without re¬ 
solving them into one-to-many re¬ 
lationships. In addition, it may 
include the most important sub- 
types without identifying all sub- 
types. And in some cases, relation¬ 
ships may not have names. 

To validate the conceptual 
data model, we present it or some 
of its aspects to the business com¬ 
munity. Therefore, we need to de¬ 
termine how well businesspeople 
will understand the data model. In 
some enterprises, higher business 
authorities may never view the 
data model diagram, and may only 
focus on the business definitions 
underlying them. In these in¬ 
stances, they approve business def¬ 
initions for entities, business own¬ 
ers for these entities, and identifiers 
for entities that don't already have 
appropriate identifiers. The empha¬ 
sis is on ensuring the most com¬ 
fortable level of supertypes and 
subtypes possible. 

In other enterprises, higher 
business authorities participate in 
a structured walk-through of the 
data model diagram. Their mem¬ 
bers can perform this walk-through 
as a group or individually. 
Individually, the reviewer can 
concentrate on those areas "owned" or 
"accessed" by the business area of 
immediate interest. 

As we saw last month, the 
next step in our data-sharing strat¬ 
egy is to develop a detailed busi¬ 
ness data model. This model should 
be a fully attributed, normalized, 
keyed, and subtyped data model 
containing essential business in¬ 
formation. A better name for this 
model is a detailed (not conceptu¬ 
al), business-oriented (not infor¬ 
mation systems-oriented) data (not 
process) model. In fact, a particular 
detailed data model may represent 
a smaller scope than the conceptu¬ 
al business data model. 

A lot of controversy exists over 
where this kind of data model fits 
into the ISA Framework. Even 
speakers at the Zachman Frame¬ 
work for Information Systems Ar¬ 
chitecture Conference (Arlington, 
Virginia, March 25 to 27, 1991) 
didn't agree on whether attributes 
and keys first appear on the busi¬ 
ness model, information systems 
model, or technology model rows. 
Questions arose as to which prop¬ 
erties of entities, relationships, 
and attributes are business-driven 
and which are information systems- 
driven. 

To answer these questions, we 
should determine the audience. 
How much of this model does the 
business community need to be 
aware of, and how much of it con¬ 
cerns only information systems pro¬ 
fessionals? Because we build this 
model according to concepts that 
originated with the relational mod¬ 
el, some may insist that it's an in¬ 
formation systems data model. 
However, remember the important 
role of the business community in 
data sharing. The detailed, busi¬ 
ness-oriented data model repre¬ 
sents, in an unambiguous way, the 
information essential to running a 
business, not a system. An infor¬ 
mation systems data model, on the 
other hand, should include the data 
needed to build a computerized 
system, including codes, process¬ 
ing flags, and other information 
systems or applications-specific 
data structures. 

Regardless of where it fits in 
the framework, we must still de¬ 
velop and gain approval for a de¬ 
tailed, business-oriented data mod¬ 
el. It isn't likely that the higher 


business authority will review a 
detailed data model or detailed 
business rules; rather, approval for 
the detailed business data model 
will probably involve intermedi¬ 
ate business management and day- 
to-day model participants. The 
business data model review ses¬ 
sions may also include business 
analysts and database designers. 

With this understanding in 
mind, let's investigate the first of 
many controversial questions sur¬ 
rounding the content of a detailed 
business-oriented data model. As 
always, I welcome reader input. 

Question one: Should 
entities in the detailed, business-oriented 
data model be fully 
normalized? The detailed, business-oriented 
model should represent 
the essential business information 
in a way that encourages data 
sharing. Therefore, it requires 
discipline—such as normalization, good 
primary identifiers, and subtyping 
and supertyping. In fact, a 
detailed, business-oriented data model 
includes resolution of many-to- 
many relationships (not to imply a 
particular information systems per¬ 
spective, but to uncover more de¬ 
tailed business rules about the 
data). Often, it's useful to present 
such details to businesspeople. 
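
As a rough illustration of what resolving a many-to-many relationship can uncover, consider a hypothetical pair of entities, order and product. The sketch below is only an assumption-laden example: the entity names, the order_line entity, and the "quantity must be positive" rule are invented for illustration and don't come from any model discussed in this column. It uses Python's built-in sqlite3 module simply as a convenient way to write the structure down.

```python
import sqlite3


def build_model() -> sqlite3.Connection:
    """Create a small in-memory schema with the many-to-many resolved."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(
        """
        CREATE TABLE product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE customer_order (
            order_id  INTEGER PRIMARY KEY,
            placed_on TEXT NOT NULL
        );
        -- The associative entity: one row per product on an order.
        -- Resolving the relationship exposes two hypothetical business
        -- rules: a product appears at most once per order, and every
        -- line carries a positive quantity.
        CREATE TABLE order_line (
            order_id   INTEGER NOT NULL REFERENCES customer_order(order_id),
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            quantity   INTEGER NOT NULL CHECK (quantity > 0),
            PRIMARY KEY (order_id, product_id)
        );
        """
    )
    return conn


def products_on_order(conn: sqlite3.Connection, order_id: int) -> list:
    """Return (product name, quantity) pairs for one order."""
    rows = conn.execute(
        "SELECT p.name, l.quantity FROM order_line AS l "
        "JOIN product AS p ON p.product_id = l.product_id "
        "WHERE l.order_id = ? ORDER BY p.name",
        (order_id,),
    )
    return rows.fetchall()
```

The point of the sketch is not the SQL itself but the questions the new entity forces: can the same product appear twice on one order, and what facts belong to the pairing rather than to either side? Those are business questions, which is why the resolved relationship is worth showing to businesspeople.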

Question two: Should arbi¬ 
trary, artificial entity identifiers 
be included in the detailed busi¬ 
ness-oriented data model? Most 
of us will agree on one surprising 
fact: The most important entities 
in an enterprise's business don't 
have definitions that are common¬ 
ly understood by everyone in the 
business community. It's even more 
likely that the most important 
business entities don't have ade¬ 
quate identifiers. As Theresa Whit- 
ener suggested in her article enti¬ 
tled "Primary Identifiers: The Basics 
of Database Stability" (Database 
Programming & Design, January 
1989), a good identifier is: 

□ Unique for each object 
instance; 

□ A definite value for every 
object instance; 

□ Explicit when an object in¬ 
stance is created; 

□ Only one consistent value 
for every object instance; 

□ Unchangeable; 

□ Not fact-giving data; 

□ Legal to implement; 


□ Fully controllable by the 
enterprise; 

□ Visible to and accessible 
by users. 

Most of us will agree that a 
good solution (and current indus¬ 
try trend) is the introduction of 
new, arbitrary, and nonintelligent 
identifiers. The question is wheth¬ 
er these "imaginary keys" should 
be part of the business-oriented 
data model. Many people believe 
that these keys are "database keys" 
and therefore shouldn't be used for 
business consumption. However, I 
agree with Whitener that such iden¬ 
tifiers should become visible to 
the business community whenever 
possible because the business needs 
them to share data effectively 
through a common, stable identifi¬ 
cation scheme. Therefore, we should 
include the identifiers in the busi¬ 
ness-oriented data model. 
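
To make a few of the listed properties concrete, here is a minimal sketch of an arbitrary, nonintelligent identifier. Everything in it is hypothetical: the Customer entity, the region attribute, and the CustomerRegistry class are assumptions chosen for illustration, and a real enterprise would assign identifiers through its own controlled process.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Customer:
    customer_id: int  # arbitrary value; encodes no facts about the customer
    name: str
    region: str       # a fact that can change without touching the identifier


class CustomerRegistry:
    """Hands out identifiers that are unique, explicit at creation, and stable."""

    def __init__(self) -> None:
        self._next_id = count(1)  # controlled entirely by the enterprise
        self._by_id = {}          # maps customer_id to the current Customer

    def register(self, name: str, region: str) -> Customer:
        # The identifier is assigned the moment the instance is created.
        customer = Customer(next(self._next_id), name, region)
        self._by_id[customer.customer_id] = customer
        return customer

    def relocate(self, customer_id: int, new_region: str) -> Customer:
        # Business facts change; the identifier does not.
        old = self._by_id[customer_id]
        updated = Customer(old.customer_id, old.name, new_region)
        self._by_id[customer_id] = updated
        return updated
```

Had the region been embedded in the identifier (a "fact-giving" key), relocating the customer would have forced a new identifier and broken every reference to the old one. That instability is the usual argument for keeping identifiers free of meaning.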

Because people react negative¬ 
ly to such identifiers, we can use 
the higher business authority to 
introduce them in a politically pal¬ 
atable way. We should educate the 
higher business authority in the 
properties of good identifiers as 
well as the present inadequacies of 
current identifiers. We should also 
gain its approval for any new iden¬ 
tifiers. In fact, as we discussed ear¬ 
lier, we can include a few of these 
identifiers in the conceptual data 
model. 

Question three: Should de¬ 
code entities be included in the 
detailed, business-oriented data 
model? Decode entities are "look¬ 
up" entities that transcribe an ab¬ 
breviated code into its full busi¬ 
ness meaning. An example is an 
entity called employee type that 
translates EMPLOYEE TYPE CODE 
(p or f) into employee types, part- 
and full-time. 
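
In code, a decode entity amounts to nothing more than a lookup. The sketch below uses the employee type example from the paragraph above; the function name and the error handling are assumptions added for illustration.

```python
# Decode entity for EMPLOYEE TYPE CODE: abbreviated code -> business meaning.
EMPLOYEE_TYPE = {
    "p": "part-time",
    "f": "full-time",
}


def decode_employee_type(code: str) -> str:
    """Translate an EMPLOYEE TYPE CODE into its full business meaning."""
    try:
        return EMPLOYEE_TYPE[code.lower()]
    except KeyError:
        raise ValueError(f"unknown EMPLOYEE TYPE CODE: {code!r}") from None
```

Seen this way, the lookup adds no business information: the distinction between part-time and full-time employees exists whether or not anyone abbreviates it.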

The existence of decode enti¬ 
ties in business data models can 
result in a lot of emotional discus¬ 
sion. From a purely business per¬ 
spective, such an entity is super¬ 
fluous to the business community, 
even though the community is ac¬ 
customed to the codes. In reality, 
the code isn't essential to running 
the business. The business commu¬ 
nity merely needs to distinguish 
between the part- and full-time 
employees. 

I've met at least two people 
who include decode entities in the 
Logic Works ERwin/ERX 

THE PRODUCT EVERYONE IS TALKING ABOUT. 


Here's What the Experts Say: 

"Maybe, someday, all databases will 
be designed this way." 

-DBMS, November 1990 


Here's What We Say: 


Tom Bruce is talking about ERwin! 

Why has ERwin become the design tool of choice? 

Tom Bruce’s new book, Designing Quality Databases, 
explains the IDEF1X method, with real-world examples 
featuring ERwin as the design tool that makes it happen. 

We think Tom Bruce’s book will help you get even more 
from ERwin tools. So, for a limited time, we’re including it 
FREE with every ERwin/ERX purchase. 

□ Tell me more about ERwin and Designing Quality 
Databases with IDEF1X Information Models. 

Name_ 

Title_ 

Company_ 

Address_ 

City_State_ 

Phone_ 



For faster response, call Logic Works today, at (609) 683-0054 or FAX (609) 924-0029. 





logic® 
works 


Logic Works 

601 Ewing Street, Suite B7 
Princeton, New Jersey 08540 

(609) 683-0054 •Fax (609) 924-0029 


ERwin Photo Credits: Michael Carr, Cynthia Engler. ERwin Diagram: Robert G. Brown. 
Logic Works is a registered trademark and ERwin is a trademark of Logic Works. 
Logic Works ERwin/ERX 

THE PRODUCT EVERYONE IS TALKING ABOUT. 

Here's What the Experts Say: 

"Maybe, someday, all databases will 
be designed this way." 

-DBMS, November 1990 

Here's What We Say: 

The experts are right. Every day more and more 
Big 8* databases are being designed with ERwin's 
logical data models and SQL schema generation. 

But why wait for "someday" when you can 
start designing your Big 8 database applications 
the right way today? What everyone is saying 
is true. 

No other CASE tool makes capturing business 
rules, generating SQL DDL, and 
reverse-engineering your existing Big 8 databases into 
ER diagrams this easy. 

Call Logic Works today! For a limited 
time, ERwin/ERX, the product everyone is 
talking about, is available for only $199 

*ERwin/ERX supports the Big 8: 

• DB2 

• Informix 

• Ingres 

• Netware SQL 

• ORACLE 

• SQLBase 

• SQL Server and 

• SYBASE 




logic® 
works 


Logic Works 

601 Ewing Street, Suite B7 
Princeton, New Jersey 08540 

(609) 683-0054 •Fax (609) 924-0029 


ERwin Photo Credits: Michael Carr, Cynthia Engler. ERwin Diagram: Robert G. Brown. 
Logic Works is a registered trademark and ERwin is a trademark of Logic Works. 
WHAT HAPPENED AT 
LAST WEEK’S 
ORACLE USERS 
GROUP MEETING? 

Users groups are becoming an increasingly promi¬ 
nent vehicle for the flow of technical tips and infor¬ 
mation about DBMSs, CASE, and related software. 

Therefore, we are looking for members of users 
groups to write about their groups’ experiences for 
Database Programming & Design’s Birds of a 
Feather column. 

The column covers a variety of issues, from a tech¬ 
nical and nontechnical standpoint. Future issues 
will include such topics as: 

• Profiles of particular users groups: industries 
represented, levels of experience 

• Hot ideas, problems, possibilities, and 
technical tips that come up during meetings 

• Organizational issues and problem solving 

• How to get the best speakers 

• Approaching vendors from a nonmarketing 
standpoint 

• Calendars of events 

• Reports of recent meetings 

• Making recommendations to and getting feed¬ 
back from vendors 

• How to use user groups as vehicles for change 

If you would like to share your experiences with 
your users group—no matter how big or how small— 
we want to hear from you! 

Theresa Rigney 

Database Programming & Design 
600 Harrison St. 

San Francisco, CA 94107 
(415) 905-2482 


information systems or technology 
data model, rather than the busi¬ 
ness-oriented model. This philos¬ 
ophy is especially true when the 
decision to use codes is made to 
save space on DASD or screens. 
You may include these codes in 
computerized systems to satisfy 
business demand for them. Even 
so, excluding them from the busi¬ 
ness-oriented data model rein¬ 
forces that they're unessential. 


OUR ROLE AS INFORMATION 
specialists is 
an opportunistic one. 
We alone may have 
the vision, immediate 
need, and methods for integrating 
the enterprise. If we do our jobs 
well, we'll be the enablers of com¬ 
mon organizational semantics and 
business policies. 

We can use the ISA Frame¬ 
work as a guide for developing a 
data-sharing strategy. We have be¬ 
gun such a strategy over the past 
few months. It probably doesn't 
matter whether we place the de¬ 
tailed business data model in the 
second or third row of the frame¬ 
work, but we must make a distinc¬ 
tion between a business-oriented 
data model and an information- 
systems-oriented data model. We 
should use the former to under¬ 
stand the business, and the latter 
to design databases for informa¬ 
tion systems. If, for whatever rea¬ 
son, we combine aspects of both, 
we should at least indicate which 
aspects are business-driven. 

Regardless of our strategy, 
we should always be aware that 
sharing isn't necessarily universal 
in its definition. We must lead the 
business community carefully 
through the lesson of enterprise¬ 
wide sharing. It's a tough lesson, 
however, because multiple owners 
must often share a common asset. 
INFOSCOPE polishes 
DB2 quality and performance 
—automatically 


At last there’s a way to guarantee 
the quality, security and per¬ 
formance of DB2 applications. 
You can do it by using InfoTel’s 
INFOSCOPE for DB2 to limit 
deviations from your established 
programming standards. 

INFOSCOPE for DB2 ensures 
quality by enforcing preestab¬ 
lished programming standards 
via an ISPF interface. 

INFOSCOPE for DB2 ensures 
security by barring the use of 
statements that are dangerous 
or prohibited under your 
standards. 

INFOSCOPE for DB2 ensures 
performance of new or existing 
DB2 applications by demanding 
well-written SQL statements, 
efficient tablespace scans, 


systematic memory strategies 
and straightforward access 
paths that minimize contention 
for resources. 

INFOSCOPE for DB2 actually 
provides a mechanism for 
measuring quality by assigning 
costs to access strategies and 
language elements. Element 
costs are accumulated, and 
INFOSCOPE for DB2 issues 
approvals, warnings, or outright 
rejections. These are based on 
total cost, relative to the rules 
you have established. 

Perhaps the most important 
feature of INFOSCOPE for DB2 
is its flexibility. You can adopt 
and supplement quality rules 
to fit your particular computing 
environment; then you may 
modify them as your applica¬ 


tion evolves and your environ¬ 
ment changes. 

Find out how you can write 
gleaming new DB2 applications 
and polish up your existing 
ones. Get full details and 
schedule a demonstration of 
INFOSCOPE for DB2 by con¬ 
tacting InfoTel today! 



InfoTel Corporation 

15438 N. Florida Avenue, Suite 204 
Tampa, Florida 33613 
1-800-543-1982 • 813-264-2090 
FAX: 813-960-5345 
TransRELATE Workbench for DB2 


DBAs. 


How Many Tools Does It Take 
To Manage DB2? 


As a DBA, managing 
an IBM® DB2 system 
falls on your shoulders. 

Daily, you’re faced with a 
host of manual, time- 
consuming and error- 
prone procedures as you 
attempt to implement and 
maintain your DB2 system and 
support your DB2 
programmers. 

To manage your DB2 
environment, you may be 
relying on a combination of the 
different DB2 tools and utilities 
currently available. However, 
this approach only adds another 
layer of complexity to an 
already complicated situation. 
You’re faced with incompatible 
tools for exchanging 
information; learning new 
command sets for each tool; 
and, finally, working with tools 
that are generally difficult to 


Now there is one 
product that provides 
the wide range of 
capabilities you need to 
meet all of your DB2 
administration 
requirements - 
TransRELATE Workbench 
for DB2 from Compuware 


One Source For All DBA 
Administrative Needs 

TransRELATE Workbench for 

DB2 consolidates these essential 
capabilities: 

• Comprehensive catalog query 
and analysis - without coding 
SQL 

• Extensive DB2 object creation 
and management 

• Extended object modification 

• Automated migration 

• Dropped object recovery 

• Automated DB2 utility generation 

• Simplified DB2 authorization 
management 

• Backup and recovery JCL 
generation 

• Automatic DCLGEN generation 

• Fully-functional DB2 command 

interface - enter DB2 commands 
from within the Workbench 

One Powerful 
Interface 

The Workbench features 
a window-driven user 
interface that gives you 
easy and rapid access to 
vital DB2 data. Multiple views 
of one or more catalogs can be 
simultaneously displayed on a single 
screen, helping you quickly analyze 
and solve problems. 

Unique Maintenance 
Migration Capabilities 

Exclusive Workbench capabilities 
include “intelligent” maintenance 
migrations. TransRELATE 
Workbench for DB2 automatically 
identifies the impact of merging 
changes to existing DB2 subsystems 
- reporting all processing steps 
required - and automatically 
generates migration procedures. 

One Product, a World of DB2 
Benefits 

By simplifying the complex process 
of environment management, the 
TransRELATE Workbench for 
DB2 

• Increases DBA productivity 

• Reduces errors 

• Shortens time to production 

• Ensures reliable production 
support 


IBM and DB2 are registered trademarks of International Business Machines Corporation. Compuware and TransRELATE are
registered trademarks of Compuware Corporation. 


Compuware is the leading 
provider of Computer Assisted 
Testing and Implementation 
(CATI) tools, designed to help
IS professionals efficiently test 
and maintain applications. 
More than 60% of IBM MVS 
and ESA mainframe 
installations worldwide use 
Compuware CATI products. 


For more information, or a 
no-cost 30-day evaluation, 
call 

1-800-535-8707

DBA SHOPTALK


Domain classification: a scalar approach to consistent data names 


THE NEXT TIME YOU
tell someone you're 
in the information 
management, commu¬ 
nications, or data pro¬ 
cessing business, don't forget to 
mention that you're also in the 
transportation business. This state¬ 
ment is less of an exaggeration 
than you might think: As a data
administrator or database adminis¬ 
trator (DBA), your responsibilities 
are likely to include the movement 
and storage of critical company as¬ 
sets. And just as a shipping docu¬ 
ment must identify each item in a 
particular shipment, the data ad¬ 
ministrator or DBA must also be 
able to identify and name what¬ 
ever data is moved or stored. 

Today, most shipped mer¬ 
chandise is usually prelabeled by 
the manufacturer and assigned Uni¬ 
versal Product Codes (UPCs) to 
make identification relatively easy. 
Unfortunately, despite our best intentions,
data "shipments" don't
bear such intrinsic, uniform, and 
consistent labeling—mainly since 
our data warehouses aren't exactly 
brimming with meaningful and 
consistently assigned data element 
names. Is it any wonder that im¬ 
posing rigorous data management 
principles is a lot like fighting 
Conan the Barbarian with only 
harsh words as weapons? The fact 
is, everyone from junior program¬ 
mers to senior analysts has jumped 
into the data-naming act. Critics of 
the resulting pluralistic paradigms 
suggest that a need for alternatives 
exists, but although they talk a 
good game, few will step forward 
to cast the first stone. And although 
CASE tools can help, they're use¬ 
ful only if suitable guidelines are 
also deployed. 

The Naming Game

BY ROBERT TAKOUSHIAN

Clearly, if our primary objective is accurate classification and
consistent naming of data elements, it's imperative that we properly
identify the "goods," which, in our case, is data. As usual, getting
to this objective can be half the fun, but because the first step is
often big, this column will suggest a method for cutting it down to size.
First, we'll select a data-naming 
structure and assign the role of do¬ 
main classification to part of it. 
Then we'll affirm the importance 
of rule-based domain analysis, ob¬ 
serve its correlation to scalar mea¬ 
surement, and uniformly assign a 
class word to the data name based 
on that scalar/domain correlation. 

Most data processing practi¬ 
tioners agree that consistently 
named data elements can simplify 
data design and management. One 
evolving standard is the Informa¬ 
tion Resource Dictionary System 
(IRDS) data element-naming para¬ 
digm, which incorporates a three- 
part name consisting of one class 
word, n generic modifier(s), and 
one prime word. The order of the 
name components can be rear¬ 
ranged. Figure 1 shows a product 
distributor's possible data element 
names for markups. 

As structures go, the IRDS 
paradigm is sound and workable. 


However, while data element names 
may be structured uniformly, their 
component words are usually as¬ 
signed subjectively (sometimes ca¬ 
priciously) based on personal per¬ 
ceptions and predilections, rather 
than done objectively through in¬ 
formed analysis. The IRDS para¬ 
digm represents good form, but it 
still needs substance. 
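
The three-part structure can be sketched in a few lines of Python. This is only an illustration of the form, not part of IRDS itself: the `ElementName` class, its field names, and the space separator are assumptions made here, and the two sample names are taken from Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementName:
    """Hypothetical holder for a three-part data element name."""
    class_word: str        # exactly one class word, e.g. "PCT"
    modifiers: tuple       # n generic modifiers, e.g. ("NY_RETAIL",)
    prime_word: str        # exactly one prime word, e.g. "MKUP"

    def render(self, sep=" "):
        # Class word first, then the modifiers, then the prime word.
        return sep.join((self.class_word, *self.modifiers, self.prime_word))


ny_markup = ElementName("PCT", ("NY_RETAIL",), "MKUP")
la_markup = ElementName("PCT", ("LA_WHLSL",), "MKUP")

print(ny_markup.render())  # PCT NY_RETAIL MKUP
print(la_markup.render())  # PCT LA_WHLSL MKUP
```

The structure guarantees uniform shape only; nothing in it checks whether "PCT" was the right class word to choose, which is the gap the rest of this column addresses.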

Recall that a domain is the 
set of allowable (or valid) values 
for a particular data element; that 
is, every data element must have a 
domain defined over it, but a par¬ 
ticular domain value doesn't have 
to appear in the database to be val¬ 
id. Two tables can be joined if each 
contains a column that is defined 
over the same (atomic) domain or 
a subset thereof. However, as in 
our example, different domains 
usually embody different proper¬ 
ties, relationships, business rules, 
and so forth. Therefore, domain 
symbols may intersect even though 
their underlying values aren't de¬ 
fined over the same real domain, 
and no subset relationship exists. 
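
A rough Python sketch makes the point concrete. It treats a domain as nothing more than a set of values, which is a simplifying assumption (a real domain also carries properties and business rules); the function names are hypothetical, and the values are the two markup domains from Figure 1.

```python
# Domains from Figure 1, modeled (as an assumption) as plain value sets.
NY_RETAIL_MKUP = frozenset({5, 10, 15, 20, 25})
LA_WHLSL_MKUP = frozenset({10, 20, 30, 40})


def is_valid(value, domain):
    # A value is valid if the domain allows it, whether or not
    # any row in the database currently holds it.
    return value in domain


def same_or_subset(a, b):
    # Stand-in for the join rule: same domain, or one a subset of the other.
    return a <= b or b <= a


shared_symbols = NY_RETAIL_MKUP & LA_WHLSL_MKUP

print(sorted(shared_symbols))                         # [10, 20]
print(same_or_subset(NY_RETAIL_MKUP, LA_WHLSL_MKUP))  # False
```

The symbols 10 and 20 appear in both sets, yet neither domain is a subset of the other, so a join on these two columns would match numbers without matching meaning.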

The phrase "the data is the business and the business is the data"
may suggest the preeminence of domains: At a minimum, comprehensive
guidelines for logical data representation (often as complex as the
business itself) should embody some semblance of domain sensitivity
(or semantics), consistency, clarity, and concision. Adopting the
aforementioned generic modifier and prime word as a combined "domain
word" can be useful in representing both enterprise-related entities
and properties through which values meaningful to the enterprise are
communicated. How the enterprise chooses to communicate these values
through the domain word is discretionary. Nevertheless, because data
elements are domain-based (read domain-constrained), they must adhere
to rules of consistency and integrity.


Class Word   Generic Modifier   Prime Word   Domain

PCT          NY_RETAIL          MKUP         (5,10,15,20,25%)
PCT          LA_WHLSL           MKUP         (10,20,30,40%)

FIGURE 1. Data element names for markups.


NME  ADR  ID  NDM  TXT  TYP  FLO
CDE
CAT  CLS  SEQ
TIM  DTE  TMS
AMTSDM  QTY  TOT
PCT  AVG

FIGURE 2. Sample class words based on scalar classification of domains.


So far, we've selected the 
three-part IRDS naming paradigm 
and suggested that two of these 
parts can represent enterprise- 
identified domains. Now let's take 
a closer look at the rules-driven 
analysis process we should use to 
identify domains. 

It's probably not a coinci¬ 
dence that identifying business 
rules to achieve domain discipline 
closely resembles the popular defi¬ 
nition of measurement: A measure¬ 
ment is a rules-based assignment 
of symbols to the perceived de¬ 
grees (or "extensions") of a prop¬ 
erty. In both instances, empirical 
properties are learned, interpret¬ 
ed, and formulated into consistent 
and meaningful rules. More spe¬ 
cifically, domain analysis is a pro¬ 
cess that includes learning the busi¬ 
ness rules that govern the rela¬ 
tionships between the domain's 
values. In short, domain analysis 
entails learning the domain's pro¬ 
perties. Metrologists have named 
the mandatory structural similar¬ 
ity between a property and its un¬ 
derlying symbolic representation 


"isomorphism." Scalar measure¬ 
ments derive their meaning from 
the fact that they can represent or 
behave like observed phenomena. 

Once we agree upon and as¬ 
sign the prime word and generic 
modifiers to a data element name, 
the only task that remains is to se¬ 
lect the proper class word. If a 
class word represents a domain clas¬ 
sification, how do we classify do¬ 
mains? Despite domain analysis fol¬ 
lowed by accurate "domain word" 
assignment, consistent domain 


classification (also known as data 
taxonomy) can remain elusive. It's 
one thing to identify a domain; it's 
quite another to classify one. 

At a presentation given by 
Judith Newton of the National In¬ 
stitute of Standards and Technol¬ 
ogy, I was pleasantly surprised to 
receive a chart entitled "Taxon¬ 
omy for Formation of Data Enti¬ 
ties." However, I noted, and was 
greatly saddened by, the absence 
of an actual framework for assign¬ 
ing class words. Furthermore, I was 
informed that the ANSI classing 
committee (X3L8) hadn't decided 
on a method for data classification. 

If scales can measure real properties, why can't they measure the
representation of some property or some attribute's assumable values,
namely, its domain? We already know that when we examine a domain we
also need to determine the relationships among values. All that
remains is to choose the appropriate "level" of measurement (as scale
types are sometimes called) and voilà: we have a systematic method for
forging consistent and meaningful data element class words from the
principles of scalar measurement. Domain classification
notwithstanding, scalar mapping can also be a useful precursor to
domain-support analysis, which consists of identifying valid
comparison operators, doing integrity checking, and so forth.


Nominal    A list of names, numbers, IDs, text, or properties that
           are neither measurements nor orderable; “nominal” refers
           to “name.” Examples: marital status, social security
           numbers, ZIP codes, phone numbers, color.

Ordinal    Any convenient relative property measurement or comparison
           scheme; permits relative ordering of a property, but the
           ordering lacks a natural origin. Examples: street numbers,
           mineral hardness (1 = lead . . . 5 = tin . . . 10 = steel).

Rank       Same as ordinal scales, but the properties must also have
           natural boundaries. Examples: military ranks, automobile
           sizes (subcompact, compact, mid-size, full-size).

Interval   Measurement relative to an arbitrary origin. Examples:
           calendar dates, military time, number line.

Absolute   Measurement of discrete variable; a count or enumeration
           of like units. Example: head count.

Ratio      Measurement that has an origin (usually zero), an ordering,
           and a set of operations; all measurements are expressed as
           a ratio of an arbitrary unit. Examples: length, weight,
           volume, mileage, duration.

FIGURE 3. Descriptions of commonly used scales.


Naturally, the two assump¬ 
tions here are that class word us¬ 
age is indeed domain-based and 
that any domain can be mapped to 
a commonly accepted scale type. 
Furthermore, a nontrivial caveat 
exists regarding the capacity of a 
database to accommodate changes— 
especially business rule changes— 
with minimal structural disrup¬ 
tions. The database design process 
includes a thorough analysis of all 
domains, which are sometimes re¬ 
ferred to as both the "fabric" of 
the database as well as the "glue" 
that binds its structure. Doing any 
part of this design in a vacuum 
weakens it and becomes a formula 
for database instability. In particu¬ 
lar, if an unforeseen business rule 
change requires domain redefini¬ 
tion, a new scale may apply. 

Figure 2 shows three primary 
data classifications that distinguish 
among qualitative, quantitative, and 
conditional data elements. Under 
each primary class are the applica¬ 
ble scale types and the valid class 
words that they can assume. See 
Figure 3 for a description of each 


scale type. Note that other scales 
exist, such as probability, binary, 
and so forth, but many of them 
may be deemed as variations of 
these more primitive scales. The 
class word "IND" (conditional) is 
for a data element that describes 
the physical record in which it ap¬ 
pears; that is, a record indicator. 
Typically, this kind of data ele¬ 
ment is used solely to control ap¬ 
plication processing rather than to 
describe a real property or attri¬ 
bute of an entity occurrence. 

Correlating class words to 
scales should be as intuitive as 
possible. Class words for industry- 
specific data can also be used (for 
example, "WGT" [weight] is used 
in the shipping industry and 
"DUR" [duration] is used in broad¬ 
casting). When considering what 
the underlying values in the data 
element's domain represent and 
how they're used, it's important 
not to confuse domain order with 
symbol orders. For example, the 
domain order of all integers be¬ 
tween zero and nine inclusive is 
(0,1,2,3,4,5,6,7,8,9), whereas the
symbol order (alphabetic) is (eight,
five, four, nine, one, seven, six,
three, two, zero).
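
The difference is easy to reproduce. The short Python sketch below sorts the same ten digits twice, once by value and once by the spelling of their names; the `NAMES` list is simply an illustrative lookup introduced here.

```python
NAMES = ["zero", "one", "two", "three", "four",
         "five", "six", "seven", "eight", "nine"]

# Domain order: the values compared as integers.
domain_order = sorted(range(10))

# Symbol order: the same values compared as alphabetic symbols.
symbol_order = sorted(NAMES)

print(domain_order)
print(symbol_order)

# Mapping the symbol order back to values shows how far it strays.
print([NAMES.index(name) for name in symbol_order])
```

Any comparison or ORDER BY applied to the symbols instead of the underlying values will follow the second ordering, which is rarely what the business rule intends.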

Keep in mind that measure¬ 
ment levels are cumulative with 
respect to the properties they can 
measure. That is, a higher measure¬ 
ment level such as an interval scale 
is more inclusive than a lower lev¬ 
el such as ordinal. An interval scale 
can be adapted to measure an ob¬ 
ject or relationship exhibiting only 
ordinality, but an ordinal scale can¬ 
not be adapted to perform the same 
function as an interval scale. 
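
One way to picture this cumulative behavior is an ordered enumeration. The sketch below assumes the six scale types form a single ladder in the order Figure 3 lists them, and the operator sets attached to each level are illustrative guesses rather than anything the figures define.

```python
from enum import IntEnum


class Scale(IntEnum):
    # Assumed ordering: the sequence in which Figure 3 lists the scales.
    NOMINAL = 1
    ORDINAL = 2
    RANK = 3
    INTERVAL = 4
    ABSOLUTE = 5
    RATIO = 6


# Operators each level adds on top of the levels below it (illustrative).
NEW_OPS = {
    Scale.NOMINAL: {"=", "<>"},
    Scale.ORDINAL: {"<", ">"},
    Scale.RANK: set(),
    Scale.INTERVAL: {"-"},
    Scale.ABSOLUTE: set(),
    Scale.RATIO: {"/"},
}


def can_stand_in(scale, required):
    # A higher level can be adapted to do the job of a lower one.
    return scale >= required


def valid_ops(scale):
    # Accumulate the operators of this level and every level below it.
    ops = set()
    for level in Scale:
        if level <= scale:
            ops |= NEW_OPS[level]
    return ops


print(can_stand_in(Scale.INTERVAL, Scale.ORDINAL))  # True
print(can_stand_in(Scale.ORDINAL, Scale.INTERVAL))  # False
print(sorted(valid_ops(Scale.ORDINAL)))
```

A table like `NEW_OPS` is also a natural starting point for the domain-support analysis mentioned earlier, since it records which comparison operators make sense for a data element once its scale is known.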

Consistency is the basis of 
any useful taxonomy. Unfortunate¬ 
ly, a useful data taxonomy is the 
exception rather than the rule. 
Nevertheless, it's clear that a thor¬ 
ough data analysis (including do¬ 
main) is a prerequisite for assign¬ 
ing data names consistently and 
objectively. As we've seen, apply¬ 
ing key scalar measurements to do¬ 
main analysis can provide a frame¬ 
work for systematically translating 
data element metadata into consistent
and meaningful data names.

Robert Takoushian is the data adminis¬ 
trator for a major communications firm. 


Bridging Your CASE Tools to DBMS/DDS 



You and your colleagues have been working very hard in the past few months to input the user requirements into 
your CASE tool only to find out that it does not talk to the file system, DBMS, or Data Dictionary Systems you already
have. You may or may not have been involved in the original CASE tool selection, but it is too late to blame anybody 
now. Fortunately, CHEN has the solution! 

Since CHEN Workbench can interface with more than 30 (Yes, 30!) commercial DBMSs and Data Dictionary Systems 
and quite a few CASE tools, it is very likely that CHEN already has it made for you. If not, we can develop it quickly to 
save your project from a disaster. So, if you are using IEW (or Excelerator,...) and want to convert the design data to 
Sybase (or Progress, Informix, DEC CDD+, CA IDD,...), give us a call. And you will be happy you did. 

Also, if you want to convert from one file/DBMS to another or from one CASE tool to another, talk to us. 

CASE-TO-CASE DBMS-TO-DBMS 



Chen & Associates

4884 Constitution Ave., Suite 1-E, Baton Rouge, LA 70808 
(504) 928-5765 Fax (504) 928-9371

SQL Productivity Tools
Every Step of the Way. 


STEP1 


Design & Analysis 


STEP 4 


Operational Control 



Application Development 


Testing & Tuning 


SQL* Advantage 

Gateways 

Performance Accelerator 

SQL* Debug 

4GL Environments 

SQL* Batch 

SQR Family 


TOP* Converter 


SQL Productivity. Two words often promised but, until 
now, never delivered by a single vendor. That was before SQL 
Solutions stepped in with the SQL Productivity Environment, 
the first suite of multi-RDBMS SQL productivity tools for the 
entire SQL Application Lifecycle. 

Armed with these tools, SQL Solutions consultants have 
designed, developed and implemented hundreds of applica¬ 
tions for every major RDBMS, including SYBASE™, ORACLE®, 
Rdb™, INFORMIX™, INGRES™, and DB2™.

SQL Productivity Environment features: 

■ Deft, the premier RDBMS CASE Solution for Macintosh, 
VMS and Unix. 

■ SQL*Advantage, a programmer productivity environment
for SQL.

■ SQL*Debug, an interactive, source-level debugger for
SQL.

■ 4GL Environments, high-performance 4GL application 
development tools. 

■ SQR Family, a family of 4GL report writing tools. 

■ Gateways, gateway services opening up RDBMS worlds
(SYBASE, ORACLE, Rdb, INFORMIX, INGRES, RMS). 


■ TOP* Converter, a tool that automatically 
converts SQL*Forms 2.3 to 3.0 triggers. 

■ Performance Accelerator, an optimizer for ORACLE 
SQL*Forms. 

■ SQL*Batch, a tool adding multi-tasking to RDBMSs
under VAX VMS. 

■ DBA Companion Environment, operational control 
tools for DBAs. 

■ SA Companion, tools for System Administrators. 

SQL Productivity Environment is the most comprehensive 

suite of tools in the industry today. And backed by our SQL 
Systems Integration services, we're with you every step of the 
way. Call us. Today. 



SQL Solutions, Inc. 

The SQL Systems Integrators
8 New England Executive Park, Burlington, MA 01803,
617-270-4150 or 1-800-933-0044 


SYBASE and Deft are trademarks of Sybase Corporation, ORACLE and SQL*FORMS are registered trademarks of Oracle Corporation, Rdb, RMS and VMS are trademarks of Digital Equipment
Corporation, INFORMIX is a registered trademark of Informix Software, Inc., INGRES is a registered trademark of Ask Computer Systems, Inc., DB2 is a registered trademark of IBM Corporation,
Macintosh is a registered trademark of Apple Computer, Inc., TOP*Converter is a trademark of Comtecno B.V.













Cursors versus queries, and COBOL versus SQL


"Even if you invent a story, you
stick by it. We're journalists here!"

- Editor at Weekly World News
to reporter Mark Kramer

"Even if you screw up a database,
you stick by it. We're programmers
here!"

- Joe Celko


A FRIEND OF MINE AT
Nissan Motor Accep¬ 
tance Corp. (NMAC) 
recently asked me to 
help with a database 
design problem. NMAC's loan ap¬ 
plications are run against one or 
more credit bureaus—which credit 
bureau it uses depends on the ZIP 
code of the dealership or appli¬ 
cant. Obviously, it's better for peo¬ 
ple in New York to use a New 
York credit bureau rather than a 
California bureau. Each credit bu¬ 
reau has a two-letter abbreviation 
code and its field is a concatenated 
string of these two-letter codes, up 
to five codes in length. 

NMAC's proposed method for 
handling this problem was to write 
a procedure in a host language that 
would accept the string as a pa¬ 
rameter, parse it, and do all the 
credit checking work via leased 
telephone lines. The query for this 
method is simple: 

SELECT *
FROM CreditBureaus
WHERE (zipcode = :ThisZip);

Queries and COBOL

BY JOE CELKO

The question is, is this table properly normalized? The immediate
response is no, because the string hides a one-dimensional array of
two-letter codes. Arrays violate the flat file condition for first
normal form. Instead, the schema should have a table with the columns
ZIP code and credit bureau code.
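
To make the hidden array visible, here is a small Python sketch that unpacks one concatenated field into one row per credit bureau. The function name, the sample ZIP code, and the bureau codes "TR", "EQ", and "TU" are all made up for illustration; only the two-letter width and the five-code limit come from the description above.

```python
def split_bureaus(zipcode, packed, width=2, max_codes=5):
    """Turn one packed string into (zipcode, bureau_code) rows."""
    if len(packed) % width != 0:
        raise ValueError("packed string is not a whole number of codes")
    codes = [packed[i:i + width] for i in range(0, len(packed), width)]
    if len(codes) > max_codes:
        raise ValueError("more codes than the field allows")
    return [(zipcode, code) for code in codes]


rows = split_bureaus("10001", "TREQTU")
print(rows)  # [('10001', 'TR'), ('10001', 'EQ'), ('10001', 'TU')]
```

Each tuple corresponds to one row of the two-column table; note that the position of a code within the string is lost in this form, which matters if the order turns out to carry meaning.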

Upon further investigation, I 
found that the table of ZIP codes 
and credit bureau codes is keyed 
on the ZIP code, so it always pro¬ 
duces the same vector for each ZIP 
code. Each credit bureau's infor¬ 
mation appears once in its ZIP 
code. 

Encoding schemes are a ma¬ 
nia of mine, as loyal readers will 
remember from an earlier col¬ 
umn ("Make or Break Your Sys¬ 
tem," DBA Shoptalk, March 
1989). The credit bureau uses a 
concatenation scheme. However, 
two kinds of concatenation encod¬ 
ing schemes exist: ordered and 
unordered. In an ordered scheme, 
"AB" isn't the same as "BA," but 
they are the same in an unordered 
scheme. 

The next question is, what 
does this order mean, if anything? 
I discovered that credit bureaus are 
ordered in the string from least ex¬ 
pensive to most expensive to use. 
This order minimizes credit report 
costs. Therefore, if you normalized 
the data, you would need a table 
with the columns ZIP code, credit 
bureau code, and report cost. The 
query that will give you the least 
expensive bureau for a given ZIP 
code is easy. Just remember to re¬ 


strict information to the target ZIP 
code as follows: 

SELECT bureau#, cost
FROM CreditBureaus
WHERE (zipcode = :ThisZip)
AND cost = (SELECT MIN(cost)
            FROM CreditBureaus
            WHERE (zipcode = :ThisZip));
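
For readers who want to try the query, the following Python sketch runs it against an in-memory SQLite database. Several details are assumptions made for the demonstration: the column is named `bureau` because SQLite does not accept `bureau#` as an unquoted identifier, costs are stored as whole cents, and the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CreditBureaus ("
    " zipcode TEXT, bureau TEXT, cost INTEGER,"
    " PRIMARY KEY (zipcode, bureau))"
)
conn.executemany(
    "INSERT INTO CreditBureaus VALUES (?, ?, ?)",
    [("10001", "AA", 350), ("10001", "BB", 225),
     ("10001", "CC", 400), ("90210", "AA", 300)],  # invented sample rows
)

CHEAPEST = """
SELECT bureau, cost
FROM CreditBureaus
WHERE zipcode = :ThisZip
  AND cost = (SELECT MIN(cost)
              FROM CreditBureaus
              WHERE zipcode = :ThisZip)
"""

cheapest = conn.execute(CHEAPEST, {"ThisZip": "10001"}).fetchall()
print(cheapest)  # [('BB', 225)]
```

Dropping the ZIP code restriction from the subquery would compare against the cheapest bureau anywhere, which is why the restriction has to appear in both places.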

However, this ideal situation 
isn't always the remedy. If you can't 
find information on an applicant 
at the first bureau, you'll then have 
to try the rest of the credit bureaus 
in order of increasing cost. Find¬ 
ing the nth cheapest bureau is a lit¬ 
tle harder, however. Assume that 
no two credit bureaus in the same 
ZIP code charge the same rates. If 
you ask for the (n + 1)th bureau in an
area with only n bureaus, then 
you'll expect to get back an empty 
result table. The host language plac¬ 
ing the credit inquiry can use this 
empty table as a flag to stop. 

You can also perform this op¬ 
eration with a cursor ordered by 
ascending cost. Use the cursor to 
load a five-element array and loop 
through the elements. However, 
keep in mind that cursors are 
frowned upon by pure SQL fanat¬ 
ics because they use a host lan¬ 
guage to perform. 
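
A minimal sketch of the cursor approach, with Python standing in for the host language and SQLite for the database. The table contents, the `bureau` column name, and the `has_report` callback (which stands for the real credit inquiry) are hypothetical.

```python
import sqlite3


def bureaus_by_cost(conn, zipcode, limit=5):
    # Open a cursor ordered by ascending cost and load at most
    # five codes into a list, the "five-element array".
    cur = conn.cursor()
    cur.execute(
        "SELECT bureau FROM CreditBureaus"
        " WHERE zipcode = ? ORDER BY cost ASC",
        (zipcode,),
    )
    return [row[0] for row in cur.fetchmany(limit)]


def first_bureau_with_report(bureaus, has_report):
    # Try each bureau in order of increasing cost; stop at the first hit.
    for code in bureaus:
        if has_report(code):
            return code
    return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CreditBureaus (zipcode TEXT, bureau TEXT, cost INTEGER)")
conn.executemany(
    "INSERT INTO CreditBureaus VALUES (?, ?, ?)",
    [("10001", "AA", 350), ("10001", "BB", 225), ("10001", "CC", 400)],
)

ordered = bureaus_by_cost(conn, "10001")
print(ordered)  # ['BB', 'AA', 'CC']
print(first_bureau_with_report(ordered, lambda code: code == "CC"))  # CC
```

The loop and the stopping rule live entirely in the host language here; the database only supplies the ordering.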

THE BEST PURE SQL
solution I've found is 
a self-join with a 
GROUP BY clause. The 
idea is to take each 
cost within a given ZIP code and 
build a group of costs that are less 
than or equal to it. The cheapest 
cost is in a group with one row, 
the second cheapest is in a group 
of two rows, and so forth. The nth 
cost will be the largest value in its 
group, so you can get a single-row 
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answer by pulling out the maxi¬ 
mum cost for each group. The que¬ 
ry can be changed by using a new 
value for n in the HAVING clause and 
a new ZIP code in the WHERE clause. 
The query follows: 

SELECT A.bureau#, MAX(B.cost)
FROM CreditBureaus A, CreditBureaus B
WHERE (A.cost >= B.cost)
AND (A.zipcode = B.zipcode)
AND (A.zipcode = :ThisZip)
GROUP BY A.bureau#
HAVING COUNT(*) = (:n);

Notice that this query depends on distinct costs: If two credit
bureaus in the same ZIP code charge the same price, the result is
empty for some values of n and holds more than one row for others.
Using the BETWEEN predicate or modifying the HAVING clause to use
"<=" can pull out any contiguous sublist.
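
The self-join can be tried out with the Python sketch below, which runs it on SQLite. As before, the `bureau` column name, the costs in cents, and the sample rows are assumptions for the demonstration; the aggregate is written as `MAX(B.cost)` because SQLite rejects an unqualified `cost` in a self-join as ambiguous. ZIP code 60601 is included to show what happens when two bureaus tie.

```python
import sqlite3

NTH_CHEAPEST = """
SELECT A.bureau, MAX(B.cost)
FROM CreditBureaus A, CreditBureaus B
WHERE A.cost >= B.cost
  AND A.zipcode = B.zipcode
  AND A.zipcode = :ThisZip
GROUP BY A.bureau
HAVING COUNT(*) = :n
"""


def nth_cheapest(conn, zipcode, n):
    # One group per bureau A, holding every bureau B that costs no more.
    params = {"ThisZip": zipcode, "n": n}
    return conn.execute(NTH_CHEAPEST, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CreditBureaus (zipcode TEXT, bureau TEXT, cost INTEGER)")
conn.executemany(
    "INSERT INTO CreditBureaus VALUES (?, ?, ?)",
    [("10001", "AA", 350), ("10001", "BB", 225), ("10001", "CC", 400),
     ("60601", "XX", 100), ("60601", "YY", 100), ("60601", "ZZ", 200)],
)

print(nth_cheapest(conn, "10001", 1))  # [('BB', 225)]
print(nth_cheapest(conn, "10001", 2))  # [('AA', 350)]
print(nth_cheapest(conn, "10001", 4))  # []  (only three bureaus: stop flag)
print(nth_cheapest(conn, "60601", 1))  # []  (tie: no group has one row)
print(sorted(nth_cheapest(conn, "60601", 2)))  # [('XX', 100), ('YY', 100)]
```

With the tied costs, both XX and YY sit in groups of two rows, so n = 1 finds nothing and n = 2 finds both, which is why the no-ties assumption matters for this query.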

Now, I should be honest: The 
cursor is probably faster and easier 
to use than my snappy query. SQL 
programmers are like APL programmers;
they can write anything in
one statement! And, just like APL, 
when they're finished, other pro¬ 
grammers can't read it. The real 
problem with tricky queries isn't 
that they're hard to read later; 
rather, they can take days to run 
because they often have hidden 
Cartesian products and self-joins, 
operations on nonindexed columns, 
and such other CPU cycle-eating 
monsters. 

RED BRICK SYSTEMS

of Los Gatos, Califor¬ 
nia was founded by 
Ralph Kimball, who 
also founded Meta¬ 
phor Computer Systems. He is both 
Red Brick's president and techni¬ 
cal expert. The company's product 
is a software SQL query optimizer 
called Gold Mine, which promises 
at least a 10 to one improvement 
in query speed. Therefore, this 
product should be able to help me 
with my killer queries. 

Three versions of Gold Mine 
are available now. The stand-alone 
version is a small but complete 
RDBMS system in itself. The at¬ 
tached version works in tandem 
with an existing RDBMS server, 
while the integrated version places 
Gold Mine technology inside an 
RDBMS server. 

Gold Mine is aimed at databases over 10 gigabytes, so don't expect
to see it on your laptop. Because it tries to avoid building scratch
files, the required storage space is based on the size of the answer,
not the complexity of the query. As a rough estimate, index
generation is 15 percent of the original flat file size.

Because you have to do a lot 
of work to get the information that 
Gold Mine needs from the data¬ 
base, the company recommends 
schema tuning for best results. 
The product's weakness is that all 
join paths must be known at sche¬ 
ma creation. As a result, only equi- 
joins can be used on foreign keys. 
The product works best with a star 
join; that is, one central table with 
columns that act as foreign keys to 
other tables. In the real world, 
these restrictions aren't bad for 
performance improvement. 

The other part of the product 
is Reduced Instruction SQL (RISQL), 
which is an extended version of 
SQL-89. Any standard, vanilla SQL 
query can run in RISQL. The ex¬ 
tensions are for aggregate func¬ 
tions that depend on the ORDER BY 
clause to let you do traditional 
break-and-summarize reports. 

Overall, I don't like RISQL 
extensions within the SELECT clause 
because they're dependent upon an 
ordered result table that violates 
the set orientation of SQL. Many 
of the extensions are very complex 
and hard to figure out. Some of 
them will also be available in 
SQL2, but with different syntax. 
Red Brick Systems should have 
looked ahead and checked the 
forthcoming standards. 

I can write all of the prod¬ 
uct's new aggregate functions in 
standard SQL. Although writing 
the functions requires horrible, 
complex CREATE VIEW and SELECT state¬ 
ments that run for days, it's possi¬ 
ble. Remember my earlier remarks 
in this column about SQL program¬ 


mers who can write anything in 
one SELECT statement? 

I would rather have a straight 
report writer that uses a standard 
SQL query to access data. This func¬ 
tion will let you save the query 
while you change the formatting 
and summary functions for differ¬ 
ent reports. Therefore, I recom¬ 
mend that you look for the Gold 
Mine engine, but ignore the RISQL 
extensions. 

AS MUCH AS I HATE
to admit it, most of 
the commercial indus¬ 
try is still writing pro¬ 
grams in COBOL. 
However, a single SQL statement 
can literally equal several hundred 
lines of COBOL. Furthermore, the 
single SQL statement can run fas¬ 
ter and be easier to read than an 
entire COBOL program. For ex¬ 
ample, try to write a COBOL pro¬ 
gram to perform the following 
query: 

SELECT dept#, MAX(salary), MIN(salary), AVG(salary)
FROM Employees
WHERE (sex = 'F')
GROUP BY dept#
HAVING (COUNT(dept#) > 1
AND AVG(salary) < (SELECT AVG(salary) FROM
Employees));

When you try to format a readable 
report with this query, you'll see 
some of the advantages of COBOL 
as a host language for SQL. 
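
To get a feel for the difference in size without writing any COBOL, the sketch below runs the query on SQLite and then computes the same answer with explicit loops in Python. Python is only a stand-in for a procedural host language here, the column is named `dept` instead of `dept#` to suit SQLite, and the employee rows are invented.

```python
import sqlite3

ROWS = [  # (emp, dept, sex, salary): invented sample data
    (1, 10, "F", 30000), (2, 10, "F", 40000), (3, 10, "M", 90000),
    (4, 20, "F", 80000), (5, 20, "F", 90000), (6, 30, "F", 20000),
]

QUERY = """
SELECT dept, MAX(salary), MIN(salary), AVG(salary)
FROM Employees
WHERE sex = 'F'
GROUP BY dept
HAVING COUNT(dept) > 1
   AND AVG(salary) < (SELECT AVG(salary) FROM Employees)
"""


def by_sql(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employees (emp INTEGER, dept INTEGER, sex TEXT, salary INTEGER)"
    )
    conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?)", rows)
    return conn.execute(QUERY).fetchall()


def by_hand(rows):
    # The same logic spelled out step by step.
    overall = sum(r[3] for r in rows) / len(rows)   # company-wide average
    groups = {}
    for _, dept, sex, salary in rows:
        if sex == "F":                               # WHERE
            groups.setdefault(dept, []).append(salary)  # GROUP BY
    result = []
    for dept in sorted(groups):
        pay = groups[dept]
        average = sum(pay) / len(pay)
        if len(pay) > 1 and average < overall:       # HAVING
            result.append((dept, max(pay), min(pay), average))
    return result


print(by_sql(ROWS))   # [(10, 40000, 30000, 35000.0)]
print(by_hand(ROWS))  # [(10, 40000, 30000, 35000.0)]
```

Even in a compact language the hand-written version is several times longer than the statement it replaces, and it still does nothing about page headings or column alignment.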

Micro Focus Inc. in Palo Alto, 
California provides a PC version 
of COBOL that works with XDB 
Systems Inc., which is based in 
Laurel, Maryland. XDB's claim to
fame is that its product can look 
like ANSI or DB2 versions of SQL. 
A developer can write a program 
on the mainframe in IBM COBOL 
with DB2 and move it to Micro Fo¬ 
cus COBOL and XDB on the PC 
and vice versa. 

This dynamic duo is working 
quite well. In fact, I watched this 
combination produce faster turna¬ 
round and generally better code at 
NMAC. The PC environment is 
more developer-friendly than main¬ 
frames, and parallel efforts are al¬ 
ways faster. George Schussel of Di¬ 
gital Consulting Inc. in Andover, 
Massachusetts reported that one of 
his clients was able to transport a 
130,000-line COBOL program from
the mainframe to the PC by changing
only 12 lines of code: major
league downsizing! I've been told 
that the UNIX version of Micro 
Focus COBOL works as well and 
allows easy movement from DOS 
to UNIX, but I have no experience 
with this product. 

ACUCOBOL, WHICH
also has a good rep¬ 
utation for portabil¬ 
ity and downsizing, 
has taken an approach 
other than providing embedded 
syntax. It has a general file access 
mechanism in its COBOL that looks 
like standard COBOL I/O to the 
programmer but, underneath this 
level, it hooks into specialized in¬ 
terfaces for different file systems. 
Its new Acu4GL product is an ex¬ 
tension of this idea. It provides a 
link to SQL databases while hid¬ 
ing the details from the COBOL 
programmer. So far, it has Oracle 
and Informix links, with others in 
the works. (As an aside, I hate the 
product's name because it sounds 
like a 4GL product, rather than a 
database access tool. Instead, may¬ 
be the company should try "Acu- 
Access," "AcuDBlink," or some¬ 
thing similar.) 

Compiler directives in Acu- 
4GL's COBOL file-definition code 
map structured records and SQL 
flat files into each other and han¬ 
dle datatype conversions, such as 
the conversion of numeric fields to 
date columns and renaming fields. 
This mapping means that group
names and REDEFINEs aren't allowed 
in the mappings. Acu4GL's two 
goals are to have no data duplica¬ 
tion and let programmers write sim¬ 
ple, straightforward COBOL with¬ 
out having to know anything about 
SQL. 

Micro Focus COBOL is a true 
compiler, generating executable 
modules, while Acucobol is interpreted.
This difference explains how
the company is able to change the 
I/O at run time and claim faster 
compile time than Micro Focus. The 
trouble is, you pay for this faster 
compile time in execution speed 
and code size for every run after 
the first one. 

Ignoring the COBOL compiler
differences, I'm a little hesitant
about Acucobol because its SQL 
interface doesn't adhere to the 


ANSI-defined host language inter¬ 
faces. This inconsistency could be 
a handicap for porting mainframe 
applications to the PC and vice 
versa. Another weak spot is the 
product's lack of an indicator vari¬ 
able similar to those in the ANSI- 
standard embeddings. An indica¬ 
tor variable is an integer value that 
returns a flag when errors or NULLs 
in its associated variable exist. In¬ 
stead, Acu4GL's NULLs are mapped 
to zeros or blanks. 
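
The cost of losing the indicator can be shown with a small Python sketch. Nothing here is Acu4GL's or any vendor's actual interface: `HostValue` and both fetch functions are hypothetical stand-ins, and the convention that a negative indicator marks a NULL follows the ANSI embedded SQL practice.

```python
from typing import NamedTuple, Optional


class HostValue(NamedTuple):
    value: int
    indicator: int  # 0 = value is usable, -1 = the column was NULL


def fetch_with_indicator(column: Optional[int]) -> HostValue:
    # ANSI-style embedding: the indicator travels next to the value.
    if column is None:
        return HostValue(0, -1)
    return HostValue(column, 0)


def fetch_null_as_zero(column: Optional[int]) -> int:
    # The alternative described above: NULL simply becomes zero.
    return 0 if column is None else column


print(fetch_with_indicator(None))  # HostValue(value=0, indicator=-1)
print(fetch_with_indicator(0))     # HostValue(value=0, indicator=0)
print(fetch_null_as_zero(None), fetch_null_as_zero(0))  # 0 0
```

With the indicator, a missing cost and a cost of zero remain different answers; once NULL is mapped to zero, the program can no longer tell them apart.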

However, the product offers
a fast way to downsize existing,
older COBOL applications to a PC
with an SQL database. Once you 
have mapped the SQL into CO¬ 
BOL, all of your code will port 
without changes. 

If any of you have worked 
with these products, please let me 
know what you think of them in 
care of Database Programming & De¬ 
sign. I'll offer a new book as a prize 
for the best case study.

Joe Celko is a database-design consul¬ 
tant in Los Angeles. He is currently work¬ 
ing as a consultant for NetBase in Tor¬ 
rance, California. 


CDB Software 

Announces the 

APPLICATION ENABLING SERIES for DB2 


CDB/REXX 

full-function interface between REXX and 
DB2; the simplicity and power of REXX 
with the superior data handling capabilities 
of DB2 

CDB/EDIT 

“Back-to-Basics” DB2 Table Editing; 
efficient and intuitive 

CDB/EASYSQL 

SQL composer for DB2 programmers; step- 
by-step statement building and help 
facilities for the DB2 novice, time savers for 
the experienced programmer 


CDB Software, Inc. 

6464 Savoy 
Suite 120, Dept. D 
Houston, TX 77036 
713-780-2382 
FAX 713-784-1842 

A • E • I • O • U: Send or FAX for information on
the new AE Series of DB2 Tools.

And sometimes Why: Call 800-627-6561, our new 24 hour support line.

Finally, a server
that has no problem
relating to others.


New Microsoft® SQL Server 4.2
is open to any kind of data. 

In virtually any format. From any 
source, anywhere in your enterprise. 

That makes it the only database 
server that can integrate all your infor¬ 
mation systems. Which, of course, helps 
your users get more infor¬ 
mation. And your company 
get more competitive. 

SQL Server reaches
out and talks to your minis
and mainframes. On platforms
as diverse as UNIX,
MVS, and VMS. With products
as diverse as Oracle,
SYBASE, DB2, and Rdb.

It’s equally compatible with net¬ 
work standards like LAN Manager, 
NetWare, VINES, and LAN Server.

It can even handle non-relational 
data sources like VSAM files, CICS, 
and satellite information feeds - an abil¬ 
ity beyond the reach of other servers. 

All of which has an important re¬ 
sult. As you add client-server technology, 
you don’t sacrifice your existing systems. 
You simply improve them. 

Dramatically.

Even better, SQL Server 4.2 integrates
seamlessly with the Windows
operating system. So data isn’t just easier
to get. It’s easier to use.

Developers, for instance, can now 
choose from over 120 tools - and use 
scrollable cursors that make creating 
Windows applications easier than ever. 

And SQL Server has new, 
integrated Windows admin¬ 
istration. So you can man¬ 
age multiple servers from 
a single desktop. 

Controlling data is
easier, too. Our programmable
server architecture
lets you enforce data integrity,
application logic, and
even business rules. Right at the server.

The net effect? You can finally 
integrate corporate data, manage it in¬ 
telligently, and deliver it to your users. 

Quickly, safely, and accurately.

So call (800) 922-3675, Dept. Y51, 
to order a customer solutions video, or 
to get a customized information pack 
on the new SQL Server 4.2. 

We think you’ll really relate to it. 

Microsoft 
database server. It takes all kinds.
Client for Your Server

BY RICHARD FINKELSTEIN

Client/server architecture lets MIS shops pursue the "best of breed."
Here's a guide to help you make sure the best fits with your breed.


ONE OF THE MAJOR
benefits of the client/ 
server architecture is that it lets 
developers and users choose tools 
that best suit their needs. This 
"mix-and-match" approach is pos¬ 
sible because application programs 
run on the "front end": powerful 
desktop computers independent of 
the "back-end" database server. 
Front-end development tools and 
applications communicate with back¬ 
end database servers through ap¬ 
plication programming interfaces 
(APIs), which are provided by 
DBMS vendors. 

APIs are used by client appli¬ 
cations to call library routines that 
transparently route SQL command 
requests from the front-end client 
application to the database server. 
The routines control all access to 
the database server and manage 
network access so that the applica¬ 
tions developer need not be con¬ 
cerned with low-level network 


protocols. Once the request is ex¬ 
ecuted by the database server (via 
SQL commands such as SELECT, UP¬ 
DATE, INSERT, and DELETE), the database 
server sends data records or mes¬ 
sages back to the client. 
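
The request-and-response pattern described above can be sketched in a few lines. This is only an illustration: Python's built-in sqlite3 module stands in for a vendor API library (SQLite runs in-process rather than across a network), and the customer table and its columns are invented for the example.

```python
import sqlite3

# Stand-in for a vendor API library. sqlite3 runs in-process; a real
# client/server API would route these same calls across the network.
conn = sqlite3.connect(":memory:")

# The client only issues SQL commands through the API routines.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
    [(1, "Acme", "Chicago"), (2, "Bolt", "Houston"), (3, "Crane", "Chicago")],
)
conn.execute("UPDATE customer SET city = ? WHERE id = ?", ("Torrance", 2))
conn.execute("DELETE FROM customer WHERE id = ?", (3,))
conn.commit()

# The server sends data records (rows) back to the client.
rows = conn.execute("SELECT id, name, city FROM customer ORDER BY id").fetchall()
for row in rows:
    print(row)
```

The application code never mentions a network protocol or a storage format; everything it knows about the server is expressed as SQL passed through the API.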

This scheme makes applica¬ 
tion logic independent from server 
operating systems (such as OS/2, 
UNIX, VMS, MVS, and LAN Man¬ 
ager) and network operating sys¬ 
tems and protocols (such as LAN 
Manager, Netware, TCP/IP, and 
Pathworks). Front-end vendors and 
developers can use this architec¬ 
ture to build tools that access a va¬ 
riety of database server platforms. 

FORMS DEVELOPMENT TOOLS 

Database server vendors usually 
offer their own set of forms devel¬ 
opment tools tailored for use with 
the vendor's DBMS. Examples of 
these would be Oracle SQL*Forms, 
Ingres 4GL and Windows/4GL, 
Sybase APT-Workbench, and In¬ 


formix 4GL. Most of these devel¬ 
opment tools are designed to work 
solely with the vendor's DBMS. 
The exception is Ingres's 4GL and
Windows/4GL tools, which are capable
of working with Tandem
Computers' NonStop SQL and Digital
Equipment Corp.'s Rdb/VMS
as well as the Ingres RDBMS.

All of these tools have good 
functionality, but often users will 
have requirements that make third- 
party products a better choice. Some 
of these requirements may include: 

□ More functionality 

□ Better performance 

□ Support for a particular cli¬ 
ent platform (for example, Windows 
or OS/2 PM) 

□ Support for a greater vari¬ 
ety of database servers 

□ Better portability among 
clients and platforms 

□ Better support for team 
development (for example, shared 
libraries) 
□ Superior debugging aids.

Therefore, dozens of third- 
party forms tools are currently 
marketed for use with the major 
SQL database servers. The Micro¬ 
soft/Sybase product, SQL Server, 
alone has over 100 front-end prod¬ 
ucts from which to choose. Once 
you've decided to evaluate third- 
party tools, it's a good idea to set 
up some criteria for quickly elimi¬ 
nating potential products and de¬ 
veloping a short list of contenders. 
Normally, the database server will 
be selected first, thereby eliminat¬ 
ing front-ends that do not have 
links to the target server. The next 
step may be to decide on the client 
operating system platform. 

CHARACTER-BASED VS. GUI 

Client applications can use either
character-based or graphical user
interfaces (GUIs). DOS and UNIX
workstations are popular
character-based platforms, while
Windows, OS/2 Presentation Manager
(PM), Motif, and X-Windows
are possible GUI environments.

Character-based systems have
been around for a long time and
offer mature and proven capabilities.
Examples of popular client/server
development tools for UNIX
and VAX database servers are
Uniface Corp.'s Uniface, JYACC Inc.'s
JAM, Unify Corp.'s Accell, and
Cognos's Powerhouse. Borland
International's dBASE IV and Paradox,
DataEase International's DataEase, and
Revelation Technologies' Revelation
are examples of commonly used
DOS front-end development tools.

Client/server computing is
new, so look for products that
have been used in production
client/server environments. Just because
a product works well in a multiuser
or PC network environment does
not mean it will perform well in a
client/server application. Many
character-based systems were not
designed for client/server and do
not link well with the back-end
server or take advantage of database
server capabilities. The best
way to judge a product's worthiness
is to test it out and find references
who have set up applications
similar to yours.

GUI environments are new
and do not have the user base or
maturity of character-based products.
Most of the recently released
GUI front-end products have been
designed specifically for client/server
computing. They make better
use of desktop power through
graphical interfaces and provide better
support for advanced database server
capabilities such as complex
data types (for example, graphics,
images, and documents).
Some of the choices for use under
Microsoft's Windows 3.0 are PowerBuilder,
ObjectView, Omnis 5, and
SQLWindows. Application Manager
and Object/1 are two products
that work under IBM's OS/2
Presentation Manager. Ingres
Windows/4GL is an example of a
product that can run on a number
of GUI platforms, including
Motif (for UNIX workstations) and
Windows 3.0.

Many people have had good 
experiences with GUI products, 
but most have found that the 
products entail a long learning 
curve. Programmers have to learn 
a new method of programming for 
a more complex environment. Pro¬ 
grammers must consider many new 
elements in the GUI form design, 
such as window placement, inter¬ 
action with GUI objects (radio but¬ 
tons, push buttons, pop-up mes¬ 
sages, and so forth), color schemes, 
and fonts. In a GUI environment, 
simple character-based processes, 
such as choosing the length of an 
input field, are time-consuming 
because proportional fonts make 
maximum field widths variable 
and difficult to calculate. 

GUI environments (for example,
Windows, PM, and Motif)
themselves are new, which adds to
the potential problems. Many users
have experienced unpredictable
program failures and networking
problems when using Windows
and other GUI environments. GUIs
also require more power than
character-based systems; therefore, they
should be deployed on machines
with fast processors (preferably a
386 or better) and a good amount of
memory (4MB or more).
Character-based tools are more appropriate for
smaller machines.

Making a choice between 
character-based and GUI-based tools 
is difficult. Character-based envi¬ 
ronments are proven, but they do 
not offer the long-term flexibility 
and power of GUIs. GUIs are newer, 
have less of a track record, and 
have long learning curves—but also 
have much greater strategic poten¬ 
tial. In evaluating GUIs for your 
application, consider the following: 

□ The nature of the applica¬ 
tion. Does it require a GUI? 

□ Complexity. Is the applica¬ 
tion small enough to reduce the 
risks involved with using a new 
GUI environment? 

□ Other types of applications
with which it will interact. Are
multitasking and data exchange
necessary?

□ The long-term direction of 
the company with regard to GUIs. 

□ Existing hardware and net¬ 
work limitations. 

If you need an application up 
and running quickly with the least 
amount of risk, you will probably 
lean toward a character-based forms 
tool. GUIs should be your choice if 
the application's requirements can 
be more easily satisfied in a GUI 
environment or if the long-term 
direction of your organization is 
moving toward GUI applications. 

FORMS DEVELOPMENT 

One of the major categories of
front-end tools is forms application
development software. This software
is used to generate the basic
screens for maintaining a database.
The rule of thumb is that if the 
DBMS vendor has a forms devel¬ 
opment tool, it will probably be 
the best fit for the back-end data¬ 
base server engine. For example, if 
the database server supports bina¬ 
ry (or basic) large object (BLOB) 
fields—which can contain up to 
two gigabytes of data—the DBMS 
vendor's front-end tool will sup¬ 
port access to this field. The DBMS 
vendor's front-end tool will prob¬ 
ably support such specific database 
server features as backward and 
forward scrolling, special locking 
and concurrency controls, or stored 
procedures (partial transactions 
stored in the database server). 

A third-party vendor may
also support these back-end database
server features, but this support
must satisfy three criteria:

□ The special feature is ac¬ 
cessible through the database serv¬ 
er's API. 

□ The front-end tool is capa¬ 
ble of using the feature (many tools 
developed before client/server 
cannot accommodate such features 
as BLOBs). 

□ The front-end vendor has
invested the resources and time
to design the feature into the
front-end product so that it is
practically useful; for example, the feature
is fully supported, stable (does
not crash), and performs well.

A few specific areas should 
be evaluated for efficiency and 
correctness of implementation: 

□ Locking and Concurrency
Controls. Each database server has
its own unique brand of locking
and concurrency control mechanisms.
For example, the Microsoft/Sybase
SQL Server supports repeatable
read and program-controlled
optimistic locking; that is, the
front-end tool must check whether other
transactions have updated the same
records. However, like DB2, it supports
an update through a cursor.
Oracle defaults to read-consistency
mode, which does not lock any
record, so the front-end tool must do
its own checking before updating
any records. Gupta Technologies'
SQLBase supports locking similar
to DB2's, as well as a special type
of read-only locking, while Novell's
Netware SQL supports only
table-level locks when the transaction
is under Netware's transaction
management (that is, COMMIT/ROLLBACK
logic is being used).

The front-end tool must ad¬ 
just itself for all of these possibili¬ 
ties, depending upon which data¬ 
base server is being accessed. 
Some front-ends support only one 
locking mode, while other front- 
ends support all of those available 
for the back-end servers. Often, a 
front-end will work particularly 
well with one of the database serv¬ 
ers (the one to which the front- 
end vendor is most committed), 
while it will have less than optimal 
support for other database servers. 

Some front-ends make locking
transparent to the programmer, and
others require the programmer to
control locking directly.
Programmer-controlled locking can lead to
inconsistent application implementations
and possibly to
corrupted data. User-controlled
locking limits the product's usefulness
for users who are not trained
in managing database locking. The
best front-end tools mask
concurrency control completely from
users. 
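
One common way for a front end to "check whether other transactions have updated the same records" is a version column that every update must match. The sketch below assumes that technique; it is not taken from any of the products named here. It uses Python's built-in sqlite3 module as a stand-in server, and the account table and its version column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO account VALUES (1, 100, 1)")
conn.commit()


def read_account(conn, account_id):
    """Read the row without locking it; remember the version that was seen."""
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()


def update_balance(conn, account_id, new_balance, version_read):
    """Apply the update only if nobody changed the row since it was read."""
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, version_read),
    )
    conn.commit()
    return cur.rowcount == 1  # False means another update got there first


# Two users read the same row, then both try to update it.
balance_a, version_a = read_account(conn, 1)
balance_b, version_b = read_account(conn, 1)
first_ok = update_balance(conn, 1, balance_a - 30, version_a)
second_ok = update_balance(conn, 1, balance_b + 50, version_b)
final_row = read_account(conn, 1)
```

The second update is refused rather than silently overwriting the first. A front end that hides locking from the user would reread the row at this point and either retry or report the conflict.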

□ Messages and Codes. Front- 
end tools may replace database 
server-specific messages and codes 
with their own messages and codes. 
This feature means that you can mi¬ 
grate applications from one data¬ 
base server to another more easily. 
It is also important to have access 
to the database server-specific codes 
for debugging purposes. 
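
A minimal sketch of such a translation layer follows. The server names and numeric codes are invented for illustration and do not correspond to any real product; the point is only that the generic code drives application logic while the native code stays available for debugging.

```python
from dataclasses import dataclass

# Hypothetical native error codes for two imaginary servers.
NATIVE_TO_GENERIC = {
    ("server_a", 1001): "DUPLICATE_KEY",
    ("server_a", 1002): "ROW_LOCKED",
    ("server_b", 77): "DUPLICATE_KEY",
    ("server_b", 78): "ROW_LOCKED",
}


@dataclass(frozen=True)
class ToolError:
    generic: str      # what the application logic tests against
    server: str       # which back end raised the error
    native_code: int  # kept so developers can still debug


def translate(server, native_code):
    """Map a server-specific code to the front end's own code."""
    generic = NATIVE_TO_GENERIC.get((server, native_code), "UNKNOWN")
    return ToolError(generic, server, native_code)


err_a = translate("server_a", 1001)
err_b = translate("server_b", 77)
err_unknown = translate("server_b", 9999)
```

Application code that tests only the generic field does not change when the back end does.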

□ Handling SQL and Exten¬ 
sions. Each front-end tool must 
make decisions on the type of SQL 
commands it will generate to ac¬ 
cess the database server. For ex¬ 
ample, in some cases it may be bet¬ 
ter, for performance reasons, to 
issue an SQL SELECT command with 
a join and, in other cases, it may be 
better to avoid join logic and issue 
separate SQL SELECT commands for 
each table being accessed. Join 
operations themselves sometimes 
have to be optimized by the front- 
end tool because the back-end 
server lacks optimization. Oracle 
and Netware SQL, for example, do 
not implement cost-based (statisti¬ 
cal-based) optimization techniques. 
For these servers, the front-end 
must analyze the database struc¬ 
ture and try to determine the cor¬ 
rect join syntax (for example, order 
of table names, whether to use an 
ORDER BY, and so on) to improve per¬ 
formance. Often, this analysis is 
not possible because insufficient 
data is available, and performance 
will be very poor. 
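
The two strategies can be compared side by side. The sketch below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in server with invented dept and emp tables, and it shows that both approaches return the same rows. Which one is faster depends on the server's optimizer and the network, which this example does not measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
INSERT INTO dept VALUES (10, 'Sales'), (20, 'Support');
INSERT INTO emp VALUES (1, 'Ann', 10), (2, 'Bob', 20), (3, 'Cy', 10);
""")

# Strategy 1: one SELECT with a join; the server matches the rows.
joined = conn.execute(
    "SELECT e.emp_name, d.dept_name "
    "FROM emp e JOIN dept d ON e.dept_id = d.dept_id "
    "ORDER BY e.emp_id"
).fetchall()

# Strategy 2: a separate SELECT per table; the front end matches the rows.
dept_names = dict(conn.execute("SELECT dept_id, dept_name FROM dept"))
separate = [
    (emp_name, dept_names[dept_id])
    for emp_name, dept_id in conn.execute(
        "SELECT emp_name, dept_id FROM emp ORDER BY emp_id"
    )
]
```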

The forms tool must handle 
nonstandard SQL commands, such 
as outer joins and null values, and 
then make two decisions: 

□ Whether the back-end sup¬ 
ports the command (such as an 
outer join). If not, then: 


□ Use special techniques to 
emulate the necessary function. 
The software must be able to re¬ 
trieve results in a consistent man¬ 
ner no matter how the database 
vendor has implemented the oper¬ 
ation (for example, outer joins, null 
value operations, and so on). 
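
The two decisions can be sketched as follows. This is a rough illustration using Python's built-in sqlite3 module; the flag server_supports_outer_join is a hypothetical setting the front end would keep per server. When the flag is off, the outer join is emulated with an inner join plus a second query for the unmatched rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
INSERT INTO dept VALUES (10, 'Sales'), (20, 'Support'), (30, 'Research');
INSERT INTO emp VALUES (1, 'Ann', 10), (2, 'Bob', 20);
""")

NATIVE_OUTER_JOIN = """
SELECT d.dept_name, e.emp_name
FROM dept d LEFT OUTER JOIN emp e ON d.dept_id = e.dept_id
ORDER BY 1, 2
"""

# Emulation: matched rows from an inner join, plus unmatched departments
# padded with NULL, combined with UNION ALL.
EMULATED_OUTER_JOIN = """
SELECT d.dept_name, e.emp_name
FROM dept d, emp e
WHERE d.dept_id = e.dept_id
UNION ALL
SELECT d.dept_name, NULL
FROM dept d
WHERE NOT EXISTS (SELECT 1 FROM emp e WHERE e.dept_id = d.dept_id)
ORDER BY 1, 2
"""


def outer_join_rows(conn, server_supports_outer_join):
    """Pick native syntax when available, otherwise fall back to emulation."""
    if server_supports_outer_join:
        sql = NATIVE_OUTER_JOIN
    else:
        sql = EMULATED_OUTER_JOIN
    return conn.execute(sql).fetchall()


native_rows = outer_join_rows(conn, True)
emulated_rows = outer_join_rows(conn, False)
```

Both paths return the same result, including the department that has no employees, so the application sees consistent behavior whichever server is behind it.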

The forms tool should also be 
able to use the database server's 
extended SQL language that in¬ 
cludes procedural logic if one is 
available (examples are SQL Serv¬ 
er's Transact-SQL, Oracle's PL/SQL, 
or Ingres/4GL). Similar support
should be available for any extended
built-in functions (for example,
math, statistical, date/time,
and decode functions).

Another area to look at is 
datatype support. Products such as 
the Microsoft/Sybase SQL Server 
and Interbase have BLOB fields 
that can contain up to two giga¬ 
bytes of data. Every database serv¬ 
er has its own set of datatypes, 
which may differ from the ANSI 
SQL datatypes. Each database serv¬ 
er has its own implementation of 
time and date data types, long in¬ 
tegers, floating point numbers, and 
so forth. It is important to investi¬ 
gate how well the front-end tool 
maps its own internal datatype sup¬ 
port to the back-end server's. 

In some cases, the front-end 
software will allow access to ex¬ 
tensions only if the programmer 
drops into a third-generation lan¬ 
guage (C, COBOL, and so on) pro¬ 
cedure. This restriction affects pro¬ 
ductivity and the developer's 
ability to make maximum use of 
the front-end tool and the data¬ 
base server's extended features. If 
you purchase a database server for 
specific, nonstandard features, make 
sure all of those features are acces¬ 
sible from the front-end tool's ba¬ 
sic facilities. 

□ Processing Result Sets.
Forms tools can retrieve either a
single row or sets of rows from the 
database tables, depending upon 
the SQL command issued. The re¬ 
trieved rows are called the result 
set. If the result set is large, the 
database server retrieves enough 
rows to fill the workstation buffers 
and waits for the front end to re¬ 
quest more rows from the result 
set. 

The amount of local buffering
can sometimes be controlled
by front-end tool parameters. The
parameters let the developer opti¬ 
mize local buffering for the kind 
of processing demanded by the 
application. Some front-end tools 
require that the full result set be 
retrieved before processing can 
continue. This requirement is a 
problem for transactions that pro¬ 
cess large result sets, because users 
must wait for all rows to be re¬ 
trieved, and result set data must be 
stored in temporary local files, im¬ 
pacting performance and user re¬ 
sponse time. 
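
The difference between incremental retrieval and fetching the full result set can be sketched with the standard cursor call fetchmany. The example below uses Python's built-in sqlite3 module; BUFFER_ROWS is a hypothetical tuning parameter standing in for the front-end setting described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO orders VALUES (?)", [(i,) for i in range(1, 11)])
conn.commit()

BUFFER_ROWS = 4  # hypothetical tuning parameter: rows held locally at once


def fetch_in_batches(conn, sql, buffer_rows):
    """Yield the result set one buffer at a time instead of all at once."""
    cur = conn.execute(sql)
    while True:
        batch = cur.fetchmany(buffer_rows)
        if not batch:
            break
        yield batch  # processing can start before the last row arrives


batch_sizes = [
    len(batch)
    for batch in fetch_in_batches(
        conn, "SELECT order_id FROM orders ORDER BY order_id", BUFFER_ROWS
    )
]
```

With ten rows and a buffer of four, the client handles three batches and never holds more than four rows at a time.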

It is important for applications
to use any available
backward- and forward-scrolling features
efficiently. Backward and forward
scrolling may be offered within
locally buffered rows or for the
entire result set. If a database server
does not offer these features or if
the forms tool does not support these
mechanisms, the application must
retrieve the full result set and
reposition itself within the result set
whenever it scrolls backward.
Reexecuting an SQL command and
rereading a result set is very
time-consuming and resource-intensive,
so it should be avoided whenever
possible. A good forms tool will
have a customized interface, which 
will use a server's scrolling capa¬ 
bilities whenever possible. 
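
Where the server offers no scrolling, a front end can still avoid reexecuting the query by keeping the rows it has already fetched. The class below is a minimal sketch of that client-side approach, not a description of any product; ScrollBuffer and its method names are invented for the example.

```python
import sqlite3


class ScrollBuffer:
    """Scroll backward over rows already fetched from a forward-only cursor."""

    def __init__(self, cursor):
        self._cursor = cursor
        self._rows = []   # rows fetched so far, kept locally
        self._pos = -1    # index of the current row; -1 means before first

    def next(self):
        if self._pos + 1 >= len(self._rows):
            row = self._cursor.fetchone()  # only now ask the server for more
            if row is None:
                return None  # end of the result set
            self._rows.append(row)
        self._pos += 1
        return self._rows[self._pos]

    def previous(self):
        if self._pos <= 0:
            return None  # already at the first row
        self._pos -= 1
        return self._rows[self._pos]  # served locally, no reexecution


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (part_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO part VALUES (?)", [(1,), (2,), (3,)])
buf = ScrollBuffer(conn.execute("SELECT part_id FROM part ORDER BY part_id"))

visited = [buf.next(), buf.next(), buf.previous(), buf.previous(),
           buf.next(), buf.next(), buf.next()]
```

The trade-off is memory on the workstation: every row the user has passed stays in the local buffer.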

□ Using Stored Procedures. 
Stored procedures are a method of 
storing complete or partial transac¬ 
tions in the server. Stored proce¬ 
dures reduce network communica¬ 
tions overhead by minimizing 
client-to-server interactions. Triggers
are specialized stored procedures
that execute whenever a database
table is updated or an event
occurs in the database. The forms
tool should also be able to access
the messages and return codes that
triggers produce whenever they are
activated by the database server.

Forms tools should be able to 
use stored procedures from the ap¬ 
plication whenever they are avail¬ 
able. Some forms tools, such as 
Uniface, not only let users access 
stored procedures but will auto¬ 
matically generate stored proce¬ 
dures and triggers to increase per¬ 
formance and enforce referential 
integrity rules. This task is trans¬ 
parent to the applications developer. 

The method for executing 
and interacting with stored proce¬ 



dures differs from server to server. 
With some servers, stored proce¬ 
dures are executed with an SQL 
command. Other servers use a 3GL 
call-level interface. Depending upon 
the form tool's capabilities, it may 
be able to interface to one or both of 
these types of stored procedures. 
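
The trigger side of this can be sketched with SQLite, which supports triggers but has no stored procedures, so the example covers only part of what is described above. The tables, the trigger name emp_dept_check, and the helper insert_employee are all invented for illustration. The rule is enforced at the server, and the front end simply relays the trigger's message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
INSERT INTO dept VALUES (10, 'Sales');

-- Server-side rule: reject employees that point at a missing department.
CREATE TRIGGER emp_dept_check BEFORE INSERT ON emp
BEGIN
    SELECT RAISE(ABORT, 'unknown department')
    WHERE NOT EXISTS (SELECT 1 FROM dept WHERE dept_id = NEW.dept_id);
END;
""")


def insert_employee(conn, emp_id, emp_name, dept_id):
    """Return (ok, message); on failure the message comes from the trigger."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", (emp_id, emp_name, dept_id))
        conn.commit()
        return True, "inserted"
    except sqlite3.DatabaseError as exc:
        conn.rollback()
        return False, str(exc)


ok_result = insert_employee(conn, 1, "Ann", 10)
bad_result = insert_employee(conn, 2, "Bob", 99)
emp_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```

Because the check lives in the database, every front end that inserts into the table is held to the same rule.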

QUERY TOOLS 

Query and decision-support tools 
are primarily concerned with re¬ 
trieving and formatting data (as 
opposed to updating data). They 
must have well-designed inter¬ 
faces, good functionality, and 
good query optimization. As with 
forms tools, it is important to de¬ 
cide whether the application will 
be in a GUI- or character-based 
environment. 

At this time, only a limited
number of client-based query and
report writers use Windows or
other GUIs. One of the more popular
products is Quest from Gupta
Technologies. Quest is an integrated
query/reporting tool that works
easily with Gupta's SQLWindows
forms development tool.
Quest can access Gupta's SQLBase
database server directly through
SQLBase's API, or it can access other
database servers (for example, SQL
Server and DB2) using Gupta's
SQLNetwork routers. SQLNetwork
routers are an additional
per-workstation cost, so if the application
will be accessing a database
server that needs routers, this
additional cost should be factored
into the product evaluation.

Another popular GUI prod¬ 
uct is Channel Computing's Forest 
& Trees, which is also available in 
a character-based version. Forest & 
Trees accesses both SQL databases 
and other types of file formats, 
such as .DBF (dBASE) and ASCII 
files. Forest & Trees is primarily 
designed to monitor events in a 
database and signal the user that 


the event has occurred; however, 
it does have a report writer that 
can be used for simple reports. 

Quest's query and report 
writers are designed for users and 
developers and are capable of han¬ 
dling simple and complex prob¬ 
lems. Forest & Trees is designed 
more for end-user monitoring and 
reporting and can access a variety 
of databases and file formats. How¬ 
ever, it does not have Quest's 
breadth of reporting capabilities. 
In many cases, it is appropriate to 
use these tools side-by-side in a 
Windows environment. 

Most Windows forms devel¬ 
opment tools do not have compan¬ 
ion report writers. If you choose 
one of these tools, it may be possi¬ 
ble to use products like Quest or 
Forest & Trees to complete the ap¬ 
plication, or it may be necessary to 
use a character-based query/report 
writer running in a DOS window. 
This arrangement may be awk¬ 
ward for users and the application 
will lack a seamless GUI feel, but it 
may be the only alternative avail¬ 
able. If a character-based product 
is being evaluated, make sure it is 
thoroughly tested under Windows 
to ensure that functionality and 
stability are not lost. 

A good source of character-based
report writers is DOS
DBMS products with SQL database
server links, such as Paradox,
DataEase, dBASE IV, and Revelation.
The query and report writing tools
that come in these PC DBMSs are
normally used in conjunction with
companion forms tools that are
packaged into the product. In
evaluating these tools, some areas you
may want to compare are:

□ Ease of Connection. Con¬ 
nection and set-up procedures 
should be simple and as transpar¬ 
ent as possible to the user. User in¬ 
teraction with database servers 
should be the same as working with 
native file formats. 

□ SQL Support. Most of these 
products support a direct SQL pass¬ 
through mechanism. This device 
lets users enter an SQL command 
and send it directly to the database 
server. Developers need an SQL 
pass-through to test out SQL com¬ 
mands that will be included in an 
application. Users may need this 
facility to enter complex queries 
that can only be expressed in the 
SQL command language.

□ Query Interface. Users gen¬ 
erally prefer to avoid the SQL 
command language, if for no other 
reason than it is cumbersome to 
write out complete commands for 
every query. Most products offer 
query tools that shield users from 
SQL for simple and modestly complex
queries (certain types of queries
may be too complex to generate
from the query interface). Query
tools can be classified in three
groups: query-by-forms, query-by-example,
and prompted query.

Query-by-forms displays a 
form to the user that contains all 
(or some) of the fields contained 
in a table. The user enters values 
in the forms fields that act as the 
query criteria (that is, the AND con¬ 
ditions in the SQL WHERE clause). 
This approach is easy to use and 
works well if the user wants to see 
one record at a time. It is limited to 
queries that can be expressed with 
AND conditions (each field condi¬ 
tion represents an AND condition in 
the generated SQL WHERE clause). It 
is difficult, if not impossible, to ex¬ 
press OR or "nested" conditions 
(conditions that require parenthe¬ 
ses) using this approach. 
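
How a filled-in form becomes a WHERE clause can be sketched in a few lines. The function build_query and the customer fields are hypothetical; the table and column names are assumed to come from the form definition rather than from user input, and the values are passed as parameters.

```python
def build_query(table, form_values):
    """Turn filled-in form fields into a SELECT with ANDed conditions."""
    # Fields the user left blank contribute no condition.
    filled = [(col, val) for col, val in form_values.items() if val not in ("", None)]
    sql = f"SELECT * FROM {table}"
    if filled:
        # Every filled field becomes one equality test, joined with AND.
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col, _ in filled)
    return sql, [val for _, val in filled]


sql, params = build_query("customer", {"name": "", "city": "Chicago", "state": "IL"})
empty_sql, empty_params = build_query("customer", {"name": "", "city": None})
```

The generator has only one way to combine conditions, which is why OR and parenthesized conditions cannot be expressed from the form.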

Query-by-example (QBE) type 
interfaces display field names across 
the top of the screen in a "table" 
format. The user enters conditions 
(examples) under each column 
name. Conditions can be simple or 
complex. Multiple tables are 
joined together by specifying 
"links" between example tables. 
QBE is used on mainframes and 
popular PC products such as Para¬ 
dox. It can handle complicated 
queries (though there are limita¬ 
tions such as "nested" queries) and 
is relatively easy to learn. Howev¬ 
er, it is cumbersome to handle 
long records using this approach 
as opposed to the query-by-forms 
approach. Users must use left- 
right scrolling to enter criteria and 
view fields, while query-by-forms 
screens can display most or all 
fields on a single screen. The QBE 
interface is more difficult to learn 
than the query-by-forms approach, 
but this fact is counter-balanced by 
the increased functionality. 

Prompted query interfaces 
generate SQL commands by ask¬ 
ing users to choose tables, fields, 
join conditions, and other selec¬ 



tion criteria from pop-up menus. 
Using this point-and-click ap¬ 
proach, it is possible to create sim¬ 
ple and complex queries—even 
nested SQL queries. It is relatively 
simple to learn, and users have an 
opportunity to view the SQL com¬ 
mand as it is being developed. It 
takes more time to create queries 
using prompted query because of 
the number of menus that must be 
accessed to create one query. 

Report writers also differ in 
capabilities and ease of use. Some 
report writers use a WYSIWYG in¬ 
terface to paint the report format, 
column headings, and field posi¬ 
tions. Others use a command for¬ 
mat, which directs the report writ¬ 
er with a series of format and in¬ 
put and output commands. The 
WYSIWYG approach is generally
easier to use, while the
command-driven approach offers more
functionality. Quest is a good example of
a product that has excellent
functionality and uses a WYSIWYG
approach. SQR from SQL Solutions is
a very popular report writer that
uses a command approach and can
access a variety of SQL database
servers.

DECISION-SUPPORT TOOLS 

More and more decision-support 
tools are available for database 
servers. Spreadsheets are common¬ 
ly used as front-ends for SQL data¬ 
base servers. Microsoft's Excel uses 
a database connection product called 
Q+E to communicate with the serv¬ 
ers. Quattro uses Borland's SQLink 
product. Lotus has two products 
available for linking 1-2-3 to data¬ 
base servers. Lotus Add-In is Lo¬ 
tus's original connectivity product. 
Add-In has been supplanted by 
DataLens, which is a more
generalized and strategic product.
Depending upon the database server,
you may have a choice between
using DataLens or Add-In, or you
may have to use Add-In only. All
of these products work well, and
the choice will depend upon which
spreadsheet your organization
decides to standardize on.

Quattro has some advantages 
in that it uses SQLink (the same 
link that Paradox uses to access 
database servers) and can exchange 
information with Paradox. In this 
respect, Quattro offers a nice, 
seamless, workplace environment. 
All spreadsheet products should 
be evaluated for their ease-of-set- 
up and how well they can opti¬ 
mize SQL commands for the target 
database. If users will be updating 
the database from the spreadsheet, 
the spreadsheet software link 
should be evaluated for how well 
it uses the database server's trans¬ 
action management and integrity 
features. 

GETTING STARTED 

Database server vendors usually 
keep a list of front-end products 
that support their servers. This list 
is a good place to start when look¬ 
ing for front-end tools, especially 
if you are not sure what is avail¬ 
able in each category. 

Front-end tools for database 
servers are maturing rapidly. Ven¬ 
dors have learned how to better 
optimize their products for various 
database servers. It is important to 
test out all products thoroughly 
under conditions that approximate 
production loads. Client/server 
products tend to have more prob¬ 
lems under heavy loads, and com¬ 
munications links have a tendency 
toward breakdowns under stressed 
situations. Behavior varies from 
one front-end product to another 
and is very much dependent upon 
the database server and network 
facilities being used. 

If you cannot do your own 
stress testing, the next best thing 
you can do is find references who 
are using the tools in an environ¬ 
ment similar to yours in hardware, 
software, and network. This input 
will be invaluable for deciding 
how many users can be supported 
concurrently and what types of 
applications can be developed.

Richard Finkelstein is president of 
Chicago-based Performance Computing 
Inc., a consulting firm specializing in 
client/server product selection and ap¬ 
plications deployment. 
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BY HERBERT A. EDELSTEIN 


Responding to calls for greater user empowerment and the pressing 
need for better price/performance ratios has MIS professionals 
exclaiming, "Oh my!' 


D 


OWNSIZING IS OF- 
ten portrayed as re¬ 
placing mainframes with micro¬ 
computers, but a moment's thought 
quickly tells us that there's much 
more to downsizing than mere 
substitution. Picture a multimil¬ 
lion dollar IBM 3090 with 256 me¬ 
gabytes of memory, 200 gigabytes 
of disk storage, and 5,000 terminals 
connected to it. Someone comes in 
and disconnects the disk drives 
and terminals. A fleet of forklift 
trucks removes the mainframe cabi¬ 
nets (remembering to turn off the 
water) and wheels in a PC, to which 
the disks and terminals are then 
attached. It's a pretty ludicrous 
picture. 

In fact, downsizing is only 
one result of the most important 
change ever in the way computers 
are used: the migration of applica¬ 
tions from centralized computers 
to networks of distributed comput¬ 
ers. This move is a consequence of 
a new way organizations use data, 
coupled with a change in comput¬ 
ing technology. Each change rein¬ 
forces the other. 

THE PC FACTOR 

Mainframe data was originally 
used for the company's operation¬ 
al systems: accounting, payroll, in¬ 
ventory, or personnel. Even after 
data entry shifted to an online en¬ 
vironment, most of this data was 
still used in a batch mode for tasks 
such as reports or check writing. 
Although development of the mini¬ 
computer moved this business 
processing out to departments and 
branches, data processing, for the 
most part, has remained batch- 
oriented. 

Lions, Tigers, and Downsizing

The advent of the PC dramatically
increased the analytical
uses of the computer, and signifi¬ 
cant amounts of data were moved 
to decentralized locations to facili¬ 
tate this analysis. Unfortunately, 
the data quickly became outdated, 
and demands for access to the 
operational data could not easily 
be met without hurting the perfor¬ 
mance or risking the integrity of 
many of the central mainframe ap¬ 
plications. Also, superior PC-based 
interfaces made users less tolerant 
of 3270 screens. Mainframe re¬ 
sources, however, were strained 
by attempts to make 3270 screens 
as interactive and user-friendly as 
those on PCs. This difficulty accel¬ 
erated the shift to departmental 
computing, which increasingly re¬ 
lied on such computers as Digital 
Equipment Corp.'s VAX, IBM's AS/ 
400, or networked PCs. 

Another major motivation for 
decentralization was cost: Mini¬ 
computers and PCs were less ex¬ 
pensive than mainframes. In part, 
the difference in purchase price was 
because the mainframe included a 
degree of service (such as on-site
systems engineers) that didn't come
with the smaller computers. But an 
even larger factor was the degree of 
competition in the mini and PC are¬ 
na that fueled rapid improvements 
in price/performance ratios, as well 
as the relatively easy availability of 
third-party peripherals. In fact, 
much of what is called downsizing 
is simply off-loading mainframe 
applications to distributed smaller 
mainframes and networks. 

Up until the last few years, 
the quality of networks and inter¬ 
faces limited the move to distrib¬ 
uted, interactive systems. But re¬ 
cent improvements to products such 
as Novell's Netware 386 and Mi¬ 
crosoft's LAN Manager, as well as 
the advent of distributed comput¬ 
ing protocols, have corrected this 
problem. Add to these advances the
increased presence of UNIX in the 
commercial world and more sophis¬ 
ticated graphical user interfaces 
(such as Windows 3.0 and Motif), 
and we have a critical mass neces¬ 
sary for greater migration. 

Of course, obstacles still exist 
that make downsizing difficult. It 
remains much easier to find mainframe
COBOL programmers than
good C programmers with client/ 
server experience. The tools for 
managing networks are still in 
their early stages compared to the 
maturity of mainframe tools. And 
the difficulty of a mixed vendor 
environment with interfaces from 
PCs to local and wide area net¬ 
works (LANs and WANs) to main¬ 
frames creates many opportunities 
for problems. 

This article will examine the 
role of client/server computing in 
downsizing. We will consider what 
organizations can expect from a 
downsizing effort, and offer some 
suggestions on how to accomplish 
your goals. 

MAINFRAME VS. PERSONAL 

Let's consider once again the ma¬ 
jor differences between mainframe 
and personal computing. Main¬ 
frame computing is characterized 
by centralized, multiuser comput¬ 
ing, in which the user interacts ei¬ 
ther through a batch job or a ter¬ 
minal. A systems administration 
staff's duties include software in¬ 
stallation, data backup, and securi¬ 
ty. Personal computing, on the 
other hand, is characterized by de¬ 
centralized, single-user computers, 
and users perform their own sys¬ 
tems administration. Notice that 
these definitions are independent 
of the hardware platform. Thus, if 
you have a mainframe for your ex¬ 
clusive use, then that is personal 
computing. If your three-person 
workgroup is sharing an old IBM 
AT, then that is called mainframe 
computing. 

Downsizing is based on a 
move to distributed systems in 
which client/server architectures 
are central. Client/server comput¬ 
ing is a hybrid of mainframe and 
personal computing. The clients 
are single-user workstations ad¬ 
ministered for the most part by 
users, and are closest to the per¬ 
sonal computing paradigm; where¬ 
as the servers are multiuser, central¬ 
ized computers administered by a 
separate staff, and are closest to the 
mainframe paradigm. 

Let's look more closely at the 
contrast between mainframe and 
client/server computing. In tradi¬ 
tional host-based architectures, the 
database and applications run on a
single host computer, frequently
sharing the same memory. Main¬ 
frame computing is heavily orient¬ 
ed toward batch processing, using 
nonprogrammable workstations for 
inputting data and displaying the 
results. 

The advantage of this ap¬ 
proach is that it is easy to central¬ 
ize administration and control the 
database. Furthermore, the tech¬ 
nology for developing, deploying, 
and maintaining these applications 
is quite mature and stable. Appli¬ 
cations run on inexpensive net¬ 
works such as IBM's Systems Net¬ 
work Architecture, using 3270 data 
streams or asynchronous terminals. 
The amount of network traffic is 
relatively limited since only the 
actual data needed is shipped in 
either direction. 

However, these host-based 
applications have a number of dis¬ 
advantages. They require very pow¬ 
erful computers that support not 
only the DBMS but a multitude of 
applications and users. Each new 
application or user increases the 
demand for resources and quickly 
saturates even a large mainframe. 
Centralized control has also proved 
too unresponsive to the needs of 
users, particularly those who want 
to analyze the data or create ad hoc
reports. Furthermore, user interfaces
of host-centered applications
are often primitive compared
to the best of the PC interfaces; 
and, because the interface is cus¬ 
tomized to a particular application, 
training costs are high. 

The solution to these diffi¬ 
culties is based on distributing 
computing power and data closer 
to the ultimate users; in other 
words, using cooperative process¬ 
ing. In cooperative processing, por¬ 
tions of an application run on dif¬ 
ferent computers, with data located 
at one or more sites. The simplicity 
of this definition hides two major 
issues: first, how the programmer 
logically divides the work; and 
second, finding a mechanism that
enables programs to work together
efficiently. 

CLIENT/SERVER 

Client/server architectures, a sub¬ 
set of cooperative processing, ad¬ 
dress the first issue: the logical 
split of functionality. The user 
portion of an application runs on 
one or more client computers, and 
works cooperatively with the sys¬ 
tems portion running on one or 
more server nodes. The user code 
running on the client generally 
handles functions including pre¬ 
senting and formatting data, data 
entry procedures, transaction log¬ 
ic, business rules and models, ap¬ 
plication flow control, reporting, 
and so forth. 

Database servers are central 
to almost all downsized applica¬ 
tions. The database functionality 
runs on a server, which provides 
the data dictionary (where data¬ 
base definitions reside), manages 
concurrency, handles journaling 
and recovery, regulates security, 
controls transactions, optimizes data 
access, and performs all the other 
database tasks. The data and DBMS 
may stay on the mainframe (which 
we would now call the server). But 
sometimes, because the application 
tasks have been off-loaded to the 
clients, a smaller computer can be 
substituted. 

Some products that classify 
themselves as database servers are 
really providing cooperative pro¬ 
cessing—they let user application 
functions reside on the server. The 
mechanism most commonly used 
is called stored procedures. Stored 
procedures let the user write pro¬ 
cedural logic containing SQL state¬ 
ments, name these procedures, com¬ 
pile them, and store them as part 
of the database itself. To perform 
this task, these products require
extensions to standard SQL. SQL Server
(from Sybase and Microsoft), InterBase
(from Borland International),
and Ingres (from ASK/Ingres) sup¬ 
port stored procedures. The cur¬ 
rent releases from Oracle Corp. and 
IBM do not have this capability 
yet, but virtually all DBMS ven¬ 
dors intend to add it in the future. 

One of the main benefits of 
client/server architecture is that it 
uses available network bandwidth 
effectively. In this arrangement, a 
high-level language (typically SQL) 
is used to request the desired data.
Rather than needing to ask for in¬ 
dividual records, a single SQL state¬ 
ment can request data from many 
tables. The intelligence at the serv¬ 
er can perform a significant amount 
of processing; therefore, rather 
than transferring entire files, only 
selected information is transferred. 
Local copies of data can be used 
for a variety of tasks, such as per¬ 
forming data entry validation (al¬ 
though this activity can raise some 
synchronization issues). Stored 
procedures reduce network traffic 
even more, since a complex proce¬ 
dure that already resides on the 
server is invoked by name instead 
of being sent to the server. 
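
To make the bandwidth argument concrete, here is a minimal sketch in Python. It uses the standard sqlite3 module purely as a stand-in for a database server, so nothing actually crosses a network; the customers and orders tables, the function names, and the request counter are illustrative assumptions rather than features of any product mentioned in this article. The first function asks for records one customer at a time and does the arithmetic on the client, while the second sends a single SQL statement and lets the server join, filter, and sum.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database standing in for the server (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Acme', 'East'), (2, 'Bolt', 'West'), (3, 'Crux', 'East');
        INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0), (4, 3, 25.0);
    """)
    return conn

def totals_record_at_a_time(conn, region):
    """Fetch every customer, then issue one more request per matching customer."""
    requests = 0
    totals = {}
    customers = conn.execute("SELECT id, name, region FROM customers").fetchall()
    requests += 1
    for cust_id, name, cust_region in customers:
        if cust_region != region:
            continue  # the client discards rows it never needed
        rows = conn.execute(
            "SELECT amount FROM orders WHERE customer_id = ?", (cust_id,)
        ).fetchall()
        requests += 1
        totals[name] = sum(amount for (amount,) in rows)
    return totals, requests

def totals_single_statement(conn, region):
    """Send one statement; the server joins, filters, and sums."""
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM customers c JOIN orders o ON o.customer_id = c.id
        WHERE c.region = ?
        GROUP BY c.name
        """,
        (region,),
    ).fetchall()
    return dict(rows), 1

conn = build_demo_db()
slow, slow_requests = totals_record_at_a_time(conn, "East")
fast, fast_requests = totals_single_statement(conn, "East")
```

Both approaches return the same totals, but the record-at-a-time version needs one request per qualifying customer plus the initial fetch, and it pulls back rows that the client then throws away. On a real LAN each of those requests would be a round trip.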

WHAT TO DOWNSIZE? 

Although cost savings are the 
main goal of many organizations 
trying to downsize, they are diffi¬ 
cult to obtain. We often read in the 
press of how an application on an 
older mainframe (commonly an 
IBM 4381) has been reimplement¬ 
ed in a networked PC environment 
at a significant savings of operat¬ 
ing costs; however, these write-ups 
rarely give a complete accounting 
of the costs for the new hardware 
and software. Most cost justifica¬ 
tion analyses are good at showing 
where money will be saved from 
the old system, but are less accu¬ 
rate in estimating the costs of the 
new systems. 

The real reason to downsize is 
to benefit from moving an old ap¬ 
plication that is clearly at the end of 
its life cycle to modern technology 
and a new design that reflects the 
organization's current needs. Cost 
savings are not so much the result 
of downsizing, but rather of taking 
an old application on older, more 
expensive hardware and reimple¬ 
menting it on modern hardware. 
The cost of implementation will be 
lower than it used to be because of 
today's improved techniques, and 
the new hardware is far cheaper. 
Furthermore, the new implementa¬ 
tion will be better at solving the 
current problems than the applica¬ 
tion it replaces. 

Downsizing is frequently an 
incremental task, involving a spe¬ 
cific application or group of appli¬ 
cations that use a particular data¬ 
base. It is best to choose those 
applications for which the data
can be cleanly separated from the
mainframe database; otherwise, you 
will have the problem of redun¬ 
dant data that must be kept syn¬ 
chronized. The easiest applications 
to downsize are those that can be 
localized so that they run on a sin¬ 
gle LAN of reasonable size, and 
the database can reside on a single 
local server. 

Another important question 
to consider is how the data is used. 
Most likely, some applications will 
need access to remote data in addi¬ 
tion to their own local data. Rather 
than create redundant data, it may 
be preferable to maintain a cen¬ 
tralized DBMS that is accessed by 
these remote applications as need¬ 
ed. The choice between remote 
copies of data and centralized data 
depends largely on the frequency 
of updates, the number of updat¬ 
ing sites, and the need to keep the 
copies of data synchronized. If the 
updates are frequent or from mul¬ 
tiple sites, then centralization is 
better; if they are infrequent, then 
redundant data may be preferable. 
If the distributed applications can 
tolerate a period when they don't 
have the most current data, then 
the central site can function as a 
master database from which peri¬ 
odic snapshots are taken to update 
the subordinate databases on the 
LAN. 
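
The snapshot arrangement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two in-memory sqlite3 databases stand in for the central master and a subordinate database on the LAN, and the prices table and the refresh_snapshot function are hypothetical names invented for the example. A real product would ship the snapshot across the network and would usually run the refresh on a schedule.

```python
import sqlite3

def make_db():
    """Create a database with the (hypothetical) table being replicated."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (item TEXT PRIMARY KEY, price REAL)")
    return conn

def refresh_snapshot(master, subordinate):
    """Replace the subordinate's copy with the master's current contents."""
    rows = master.execute("SELECT item, price FROM prices").fetchall()
    with subordinate:  # one transaction: the copy is replaced completely or not at all
        subordinate.execute("DELETE FROM prices")
        subordinate.executemany("INSERT INTO prices VALUES (?, ?)", rows)
    return len(rows)

master = make_db()
lan_copy = make_db()

with master:
    master.execute("INSERT INTO prices VALUES ('widget', 9.99)")
refresh_snapshot(master, lan_copy)

# An update at the central site is not visible on the LAN until the next snapshot.
with master:
    master.execute("UPDATE prices SET price = 10.49 WHERE item = 'widget'")
stale = lan_copy.execute("SELECT price FROM prices WHERE item = 'widget'").fetchone()[0]

refresh_snapshot(master, lan_copy)
current = lan_copy.execute("SELECT price FROM prices WHERE item = 'widget'").fetchone()[0]
```

Between the update and the second refresh the LAN copy still reports the old price. That window of stale data is the cost of this design, which is why it suits only applications that can tolerate it.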

Given the present limitations 
of distributed database products, 
try to avoid those situations in 
which replicas of data must be 
kept in synchronization. Solutions 
to that problem are complex and 
involve a great deal of overhead. 

Applications that can benefit 
from the documented productivity 
gains of a graphical user interface 
(GUI) are also good candidates for 
downsizing. Real evidence exists 
that GUIs can help save money. 
Their consistency makes applica¬ 
tions easier to learn; GUIs also re¬ 
duce training and support require¬ 
ments. In addition, an improvement 
in quality comes from users being
able to create better-looking reports
—frequently with graphics—due 
to the similarity between print 
and imaging models that facilitate 
WYSIWYG (what you see is what 
you get). 

A constant concern for many 
mainframe users has been systems 
capacity, often necessitating a 
large, expensive step to the next 
size platform. In some cases, they 
even have to wait for the new, 
larger system to be designed and 
built. In a distributed environ¬ 
ment, the incremental addition of 
users and applications puts less of 
a burden on the servers because 
much of the actual work is still be¬ 
ing done on client computers. 
Therefore, applications that are run¬ 
ning out of gas in a traditional main¬ 
frame computing environment are 
good candidates for downsizing. In 
fact, simply moving to a client/ 
server architecture and retaining 
the mainframe as the server can ex¬ 
tend the mainframe's usefulness. 

Notice that these benefits do 
not result from moving to smaller 
computers, but from moving to 
distributed architectures in which 
the servers are sized for the appli¬ 
cation. While off-loading the cli¬ 
ent functions to a workstation re¬ 
duces the size requirement for the 
server, the server must still be big 
enough to handle all the clients, 
the DBMS, and the portions of the 
application logic that run on the 
server. 

DOWNSIZING CAUTIONS 

While client/server computing has 
significant benefits, it is not with¬ 
out risks. Client/server computing 
is intrinsically more complex than 
centralized host computing. For 
example, if DB2 is the database 
server, a long chain goes from the 
client application to the server and 
back again. The query is typically 
formulated on the client worksta¬ 
tion, travels across the LAN to a 
communications gateway, into a 
CICS transaction, then into DB2, 
and back again. Along the way, it 
may have to pass through main¬ 
frame security subsystems. The 
number of tools available for de¬ 
bugging problems in this chain is 
depressingly small, and the oppor¬ 
tunity for finger pointing is very 
large. 

In downsizing, it is important
to pick a suitable server. Some
real questions exist about whether 
PC hardware is yet suitable for the 
multigigabyte applications common
in the mainframe world. The
scale of many such applications in 
terms of throughput, number of 
users, database size and complex¬ 
ity, transaction volume and com¬ 
plexity, and response time require¬ 
ments is beyond the capacity of 
even the largest microcomputer 
servers. For such applications, the 
developer must seriously consider 
larger servers, such as a DEC VAX 
or an IBM 3090. On machines like 
these, memory can exceed 500MB, 
disk storage may be hundreds of 
gigabytes, and I/O bandwidths 
can be greater than 100MB per sec¬ 
ond. The downsizing difference is 
in the style of computing, not nec¬ 
essarily in the size of the server 
itself. 

Another reason for looking 
at mainframes and minis as servers 
is that while a PC-based system 
may run out of steam as an appli¬ 
cation grows, more powerful serv¬ 
ers are usually available in the line¬ 
up of mainframes. It's very clear 
that all the hardware vendors have 
recognized this fact. IBM and DEC, 
for example, are positioning their 
computers as servers in networked 
systems. 

When choosing a server, the 
problem of the operating system 
also exists. MS-DOS is not ade¬ 
quate for most serious applications 
because of limitations in memory 
management and interprocess com¬ 
munications, which means choos¬ 
ing among OS/2, some version of 
UNIX, or a vendor's own operat¬ 
ing system. Forthcoming operating 
systems such as Microsoft's Win¬ 
dows NT or Apple/IBM's "Pink" 
may also be suitable. 

The future for PC DBMS ap¬ 
plications, such as those built with 
Paradox, dBASE, and dBASE de¬ 
rivatives (such as FoxBase) should 
also be examined. As mainframe- 
type applications move to client/ 
server architectures on PC net¬ 
works using "industrial strength" 
DBMSs such as SQL Server, Ingres, 
Oracle, or InterBase, users of the 
PC DBMSs must consider whether 
to migrate applications to the data¬ 
base server. This consideration is 
particularly true for applications 
that are multiuser, have large
databases, and use data from the server
database. Vendors such as Borland 
International are developing inter¬ 
faces to a variety of database serv¬ 
ers. Therefore, it is sensible to con¬ 
sider migrating PC applications 
nearing the end of their life cycles 
to database servers, particularly if 
much of the front end can still be 
used with the server interface. 

HETEROGENEITY WITH SQL 

There is another, more important
reason why mainframes aren't dead:
data resides on them. Most
essential corporate data resides in 
mainframe databases managed by 
DB2, IMS, CA-IDMS, VSAM, or any 
of a number of other products. The 
effort required to convert this data 
and the billions of dollars of appli¬ 
cations that use it guarantees that 
data and applications will use this 
technology well into the next 
century. 

Consequently, it is quite like¬ 
ly that a downsized enterprise will 
be moving to an increasingly het¬ 
erogeneous environment, with mul¬ 
tiple types of hardware, operating 
systems, networks, and DBMSs. In 
such a world, users and systems 
people quickly learn that there is 
far more to "open systems" than 
UNIX. Interoperability can be a 
very difficult problem to solve. 

SQL is the Esperanto of the 
database world, but beware: Some 
vendors add SQL to a DBMS and 
pronounce it the solution to the 
interoperability problem. Unfortu¬ 
nately, complications occur. Not 
only are there differences among 
most SQL implementations, but oth¬ 
er differences in the target DBMSs, 
such as the concurrency and lock¬ 
ing models, may cause problems. 

Gateways are an important 
part of the solution to the hetero¬ 
geneity problem. A gateway takes 
the SQL for a particular DBMS, 
translates it to the appropriate lan¬ 
guage for the target DBMS, and
sends it to the remote computer,
thus enabling client/server com¬ 
puting in a heterogeneous envi¬ 
ronment. The gateway must know 
the difference between the native 
DBMS and the new target. Some 
notable gateways include Micro 
Decisionware's Database Gateway, 
which lets programs using the SQL 
Server application programming 
interface access DB2 mainframe 
programs; Sybase's Open Server, 
which is used to build gateways 
not only to DBMSs but to remote 
applications; Information Builders 
Inc.'s EDA/SQL, which is the data¬ 
base access component of IBM's 
Information Warehouse; and the 
ASK/Ingres Gateways. 

The implementors of distrib¬ 
uted applications must also be con¬ 
scious of the potential problems 
caused by security. While no one 
ever walked off with a mainframe 
disk farm, a PC-based server is dis¬ 
tressingly portable. In addition, 
data is vulnerable to interception 
when transmitted. While physical 
protection of the data is a function 
of site security, the DBMS vendors 
have added significant security fea¬ 
tures, and some have conformed 
with U.S. Department of Defense 
requirements for secure databases. 

Lastly, in a distributed archi¬ 
tecture, data integrity must be cau¬ 
tiously guarded. As opposed to a 
centralized mainframe where the 
DBA group carefully ensures data 
integrity, a downsized environment 
has data residing in many locations, 
with updates coming from remote 
sites and from sources other than 
carefully constructed programs. In 
these cases, database triggers (proce¬ 
dures within the database automati¬ 
cally executed when an update oc¬ 
curs) are extremely important. 
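
As an illustration of why triggers matter here, the following Python sketch defines two of them in SQLite through the standard sqlite3 module. SQLite is used only because it is easy to run; the accounts and audit_log tables, the trigger names, and the no-overdraft rule are assumptions made up for the example, and trigger syntax differs from one DBMS to another. The point is that the rule lives in the database, so it applies to every update regardless of which client or program sends it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL NOT NULL);
    CREATE TABLE audit_log (account_id INTEGER, old_balance REAL, new_balance REAL);

    -- Reject any update that would overdraw an account.
    CREATE TRIGGER no_overdraft
    BEFORE UPDATE OF balance ON accounts
    WHEN NEW.balance < 0
    BEGIN
        SELECT RAISE(ABORT, 'balance may not go negative');
    END;

    -- Record every balance change that is allowed through.
    CREATE TRIGGER log_balance_change
    AFTER UPDATE OF balance ON accounts
    BEGIN
        INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
    END;

    INSERT INTO accounts VALUES (1, 100.0);
""")

def try_update(new_balance):
    """Attempt an update the way any remote client might; report whether it was accepted."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("UPDATE accounts SET balance = ? WHERE id = 1", (new_balance,))
        return True
    except sqlite3.DatabaseError:
        return False

accepted = try_update(40.0)    # a legitimate change
rejected = try_update(-5.0)    # violates the rule enforced by the trigger
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
log = conn.execute("SELECT * FROM audit_log").fetchall()
```

The second update is refused by the database itself, and only the accepted change appears in the audit table. No client program had to remember to check the rule.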

MANAGERIAL ISSUES 

Testing and maintaining down¬ 
sized applications may be much 
more difficult than with the main¬ 
frame versions that have been re¬ 
placed. When a major new version 
of an application is brought up on 
a mainframe, a period of extensive 
testing is conducted before de¬ 
ployment. Also, new documenta¬ 
tion must be written and users 
trained. Minor enhancements or 
fixes still require testing, but their 
documentation and training bur¬ 
den may be nonexistent. 

A networked environment is
not so straightforward. Suppose a 
mainframe application with 100 
users has been off-loaded to four 
PC-based LANs. Now there are
four servers and 100 client work¬ 
stations. A new release must be 
tested on every server and client 
variation. To minimize this prob¬ 
lem, it may be possible to ensure 
that the servers are all identical 
(although undocumented differ¬ 
ences in so-called identical com¬ 
puters can in fact produce unfore¬ 
seen problems). It is much more 
difficult to ensure that the clients 
are all uniform, including not only 
the hardware, but also the system 
software such as configuration and 
boot files. A number of organiza¬ 
tions are using diskless worksta¬ 
tions that boot from the server to 
minimize the likelihood of subtle 
differences among clients. Never¬ 
theless, as time goes on, a small 
percentage of clients will probably 
require staff attention to update. 

The physical burden of send¬ 
ing out updates can be quite large. 
Products such as IBM's Distribu¬ 
tion Manager, Spectrum Concepts' 
Xcom/Software Distribution Sys¬ 
tem, and Tangram's AM:PM are 
designed to simplify the release of 
new software across networks of 
clients and servers. Unfortunately, 
they require an MVS host from 
which the software is sent to net¬ 
works. Still, a pressing need exists 
for improved tools to manage a 
downsized environment. 

PROCEED WITH CAUTION 

Beyond any doubt, the major trend 
in computing today is toward dis¬ 
tributed systems of networked PC 
clients and PC servers, minicom¬ 
puter servers, or mainframe serv¬ 
ers. And because the benefits are 
quite real and significant, in the 
long run we will have better, more 
robust, and more usable systems. 
In the short term, however, we 
must proceed with reasonable cau¬ 
tion and realistic expectations in 
order to realize the payoffs that 
are available.

BY M. TAMER OZSU AND PATRICK VALDURIEZ


If they meet expectations, distributed DBMSs will replace centralized
systems in many applications. But unsolved technical problems remain.


Distributed Database Systems


DISTRIBUTED DATABASE technology is one
of the most important comput¬ 
ing developments of the past dec¬ 
ade. During this period, distribut¬ 
ed database research has been 
intense, culminating in the release 
of a number of first-generation 
commercial products. If it meets 
expectations, distributed database 
technology will impact data pro¬ 
cessing the same way centralized 
systems did a decade ago. Indeed, 
some observers claim that within 
the next 10 years most organiza¬ 
tions will move toward distributed 
database managers, and central¬ 
ized database managers will be¬ 
come an antique curiosity. 1 

With the technology now at 
the critical stage of finding its way 
into commercial products, it is
important to seek answers to the following
questions:

□ What were the initial goals 
and promises of distributed data¬ 
base technology? How do current 
commercial products measure up 
to these promises? In retrospect, 
were these goals achievable? 

□ Have the important techni¬ 
cal problems already been solved? 

□ What technological changes 
underlie distributed data manag¬ 
ers and how will they impact the 
next generation of systems? 

The last two questions hold 
particular importance for research¬ 
ers because their answers determine 
the road map for research in com¬ 
ing years. Recent papers addressing 
these questions have emphasized 
scaling 2 and the introduction of heterogeneity
and autonomy. 3 These
problems are important, but many
others remain unsolved. Even such 
much-studied topics as distributed 
query processing and transaction 
management involve research prob¬ 
lems that have not yet been ad¬ 
dressed adequately. Furthermore, 
new issues arise as the technology 
changes, application areas expand, 
and we gain experience with the 
application of distributed database 
technology. 

WHAT IS A DISTRIBUTED 
DATABASE SYSTEM? 

A distributed database is a collec¬ 
tion of multiple, logically interre¬ 
lated databases distributed over a 
computer network. 4 A distributed 
database management system (dis¬ 
tributed DBMS) is the software sys¬ 
tem that permits the management 

shareable. In fully autonomous systems,
however, the individual com¬
ponents are stand-alone DBMSs, 
which do not know whether other 
DBMSs exist or how to communi¬ 
cate with them. 

□ Distribution deals with data. 
We consider two cases: Either data 
is physically distributed over mul¬ 
tiple sites that communicate with 
each other over some form of com¬ 
munications medium or it is stored 
at only one site. 

□ Heterogeneity occurs in var¬ 
ious forms in distributed systems, 
ranging from hardware heteroge¬ 
neity and differences in network¬ 
ing protocols to variations in data 
managers. The important forms of 
heterogeneity from the perspec¬ 
tive of database systems are differ¬ 
ences in data models, query lan¬ 
guages, interfaces, and transaction 
management protocols. The taxon¬ 
omy classifies DBMSs as homogen¬ 
eous or heterogeneous. 

The alternative system archi¬ 
tectures based on this taxonomy are 
shown in Figure 2. The arrows at 
the ends of the axes do not indicate 
an infinite number of choices but 
simply the dimensions of the taxon¬ 
omy. This article deals mainly with 
tightly integrated, distributed, and 
homogeneous database systems. 

CURRENT STATE OF 
TECHNOLOGY 

Like any emerging technology, dis¬ 
tributed database systems have their 
share of fulfilled and unfulfilled 
promises. In this section, we con¬ 
sider the commonly cited advan¬ 
tages of distributed DBMSs and 
how well current commercial prod¬ 
ucts provide these advantages. 

TRANSPARENT DATA 
MANAGEMENT 

Centralized database systems have 
taken us from a data processing 
paradigm in which data definition 
and maintenance were embedded 
in each application to one in which 
these functions are abstracted from 
the applications and placed under 
the control of a server called the 
DBMS. This new orientation results 
in data independence—the immu¬ 
nity of application programs to 
changes in the logical or physical 
organization of the data and vice 
versa. Distributed database technology
extends the concept of data
independence to environments in
which data is distributed and rep¬ 
licated over a number of machines 
connected by a network. 

Data independence is pro¬ 
vided by several forms of transpar¬ 
ency: network (and, therefore, dis¬ 
tribution), replication, and fragmen¬ 
tation transparency. Transparent ac¬ 
cess to data separates a system's 
higher-level semantics from lower- 
level implementation issues. Thus, 
database users would see a logically 
integrated, single-image database, 
even though it was physically dis¬ 
tributed, enabling them to access 
the distributed database as if it were 
a centralized one. In its ideal form, 
full transparency would imply a 
query language interface to a dis¬ 
tributed DBMS no different from 
that to a centralized DBMS. 

Most commercial distributed 
DBMSs do not provide a sufficient 
level of transparency. Part of the 
problem is a lack of support for rep¬ 
licated data management. Some sys¬ 
tems do not permit data replication 
across multiple databases; systems
that do permit it require that the
user be physically logged on to one 
database at a given time. Some dis¬ 
tributed DBMSs attempt to estab¬ 
lish their own transparent naming 
schemes, usually with unsatisfac¬ 
tory results, requiring the users ei¬ 
ther to specify the full path to data 
or to build aliases to avoid long 
path names. An important aspect of 
the problem is the lack of proper 
operating system support for trans¬ 
parency. Network transparency can 
easily be supported by a transparent 
naming mechanism in the operat¬ 
ing system. The operating system 
can also assist with replication tran¬ 
sparency, leaving the task of frag¬ 
mentation transparency to the dis¬ 
tributed DBMS. 
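
The difference between naming data by its full path and naming it transparently can be shown with a toy catalog. The Python sketch below is not how any particular distributed DBMS works; the CATALOG contents, the site names, and the resolve function are all hypothetical. It assumes only that some directory maps a logical table name to the sites holding a copy, which is enough to illustrate network and replication transparency: the user supplies a logical name, and the choice of site and physical name is made elsewhere.

```python
# Hypothetical catalog: logical table name -> list of (site, physical name) copies.
CATALOG = {
    "employees": [("chicago", "hr.dbo.employees"), ("paris", "rh.employes")],
    "orders": [("chicago", "sales.dbo.orders")],
}

def resolve(logical_name, reachable_sites, local_site=None):
    """Pick one copy of a table, preferring the local site and skipping unreachable ones."""
    replicas = CATALOG.get(logical_name)
    if replicas is None:
        raise KeyError(f"unknown table: {logical_name}")
    candidates = [copy for copy in replicas if copy[0] in reachable_sites]
    if not candidates:
        raise LookupError(f"no reachable copy of {logical_name}")
    for site, physical_name in candidates:
        if site == local_site:
            return site, physical_name
    return candidates[0]

# A user in Paris asks for "employees" and is routed to the local copy.
local_choice = resolve("employees", {"chicago", "paris"}, local_site="paris")

# If the Paris site is down, the same logical name resolves to the Chicago copy.
fallback_choice = resolve("employees", {"chicago"}, local_site="paris")
```

Without such a mapping, the user would have to write the site and physical name into every query, which is the full-path or alias burden described above.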

Full transparency is not a uni¬ 
versally accepted objective. Gray ar¬ 
gues that full transparency makes 
distributed data management diffi¬ 
cult and claims that "applications 
coded with transparent access to 
geographically distributed databases 
have poor manageability, poor 
modularity, and poor message per¬ 
formance." 5 He proposes a remote 
procedure call (RPC) mechanism 
between requester users and server 
DBMSs whereby users would direct 
their queries to a specific DBMS. 

FIGURE 2. Implementation alternatives.

We agree that the management
of distributed data is more
difficult if transparent access is
provided to users, and that the
client/server architecture with RPC-based
communications is the right
approach. In fact, some commer¬ 
cial distributed DBMSs are orga¬ 
nized in this fashion (Sybase, for 
example). However, the original 
goal of providing transparent ac¬ 
cess to distributed and replicated 
data should not be abandoned be¬ 
cause of the difficulties. The issue 
is, what should take over the re¬ 
sponsibility of managing distribut¬ 
ed and replicated data—the user 
application or the distributed DBMS? 
In our opinion, it should be the 
distributed DBMS, whose compo¬ 
nents can be organized in a client/ 
server fashion. The related techni¬ 
cal problems are among the re¬ 
maining research issues that must 
be addressed. 

RELIABILITY 

Distributed DBMSs should improve reliability, since they have replicated components and thereby eliminate single points of failure. The failure of a single site, or a communications link failure that makes one or more sites unreachable, is not enough to bring down the entire system. (It is well known that link failures may cause network partitioning and are therefore more difficult to deal with. However, the topic is beyond the scope of this article.) In a distributed database, such a failure means that some of the data may be unreachable, but with proper care, users can access other parts of the database. This proper care comes in the form of support for distributed transactions.

A transaction consists of a sequence of database operations, executed as an atomic action that transforms a consistent database state to another consistent database state, even when a number of such transactions are executed concurrently (sometimes called concurrency transparency), and even when failures occur (called failure atomicity). Therefore, a DBMS that provides full transaction support guarantees that concurrent execution of user transactions will not violate database consistency in the face of system failures as long as each transaction is correct, that is, obeys the integrity rules specified for the database.
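The atomicity described above can be tried out on a single site. The following is a minimal sketch using Python's standard sqlite3 module; the account table, the transfer function, and the "no negative balance" integrity rule are assumptions made up for illustration, not part of any system discussed here.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            # Hypothetical integrity rule: no balance may go negative.
            (low,) = conn.execute("SELECT MIN(balance) FROM account").fetchone()
            if low < 0:
                raise ValueError("integrity rule violated")
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # both updates are applied together
failed = transfer(conn, 1, 2, 500)  # violates the rule, so neither update survives
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

The second transfer leaves the database in the state produced by the first one: either every operation of a transaction takes effect or none does.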

Distributed transactions execute at multiple sites, where they access the local database. With full support for distributed transactions, user applications can access a single logical image of the database and rely on the distributed DBMS to ensure that their requests will be executed correctly no matter what happens in the system. Correctly means that user applications need not be concerned with coordinating their accesses to individual local databases, nor need they worry about the possibility of site or communications link failures during execution of their transactions. A link exists between distributed transactions and transparency since both involve distributed naming and directory management.

Providing transaction support requires the implementation of distributed concurrency control and distributed reliability protocols, which are significantly more complicated than their centralized counterparts. The typical distributed concurrency control algorithm is some variation of the well-known two-phase locking (2PL) protocol, depending on the placement of the lock tables and the assignment of lock management responsibilities. Distributed reliability protocols consist of distributed commit protocols and recovery procedures. Commit protocols enforce atomicity of distributed transactions by ensuring that a given transaction has the same effect (commit or abort) at each site where it exists, whereas recovery protocols specify how global database consistency is to be restored following failures. In the distributed environment, the commit protocols are two-phase commit (2PC) protocols. In the first phase, an agreement is established among the various sites regarding the fate of a transaction. The agreed-upon action is taken in the second phase.
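The two phases just described can be sketched in a few lines of Python. This is an illustrative, in-memory model only: the Participant class, its can_commit flag, and the two_phase_commit function are hypothetical names, and the sketch leaves out logging, timeouts, and recovery, which a real protocol needs.

```python
class Participant:
    """A site holding part of a distributed transaction (in-memory stand-in)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote. A real site would first force its log to stable storage.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return 'commit' or 'abort'; every site ends up with the same outcome."""
    # Phase 1: collect a vote from every site.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2: carry out the agreed-upon action at every site.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision
```

A single "no" vote is enough to abort the transaction everywhere, which is how the protocol guarantees the same effect at each site.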

Data replication increases database availability; copies of the data stored at a failed or unreachable site exist at other operational sites. However, replica support requires the implementation of control protocols that enforce specified replica access semantics. The most straightforward semantic is one-copy equivalence, which can be enforced by the read one, write all (ROWA) protocol. In ROWA, a logical read operation on a replicated data item is converted to one physical read operation on any one of its copies, but a logical write operation is translated to physical writes on all copies. More complicated, less restrictive replica control protocols, based on deferring the writes on some copies, have been studied but are not implemented in any systems we know of.
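As a rough illustration of ROWA, the sketch below keeps the copies of one data item in a Python dictionary keyed by site. The ReplicatedItem class and the `available` parameter (the set of currently reachable sites) are assumptions introduced for this example.

```python
import random

class ReplicatedItem:
    """One logical data item stored as copies at several sites."""

    def __init__(self, sites, value=None):
        self.copies = {site: value for site in sites}

    def read(self, available=None):
        # Read one: any single reachable copy will do.
        sites = [s for s in self.copies if available is None or s in available]
        if not sites:
            raise RuntimeError("no copy reachable")
        return self.copies[random.choice(sites)]

    def write(self, value, available=None):
        # Write all: refuse the write unless every copy can be updated.
        if available is not None and not set(self.copies) <= set(available):
            raise RuntimeError("write blocked: a copy is unreachable")
        for site in self.copies:
            self.copies[site] = value
```

The sketch also shows the price of the protocol: reads stay available as long as one copy is reachable, but a single unreachable site blocks every write.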

Concurrency control and commit protocols are two of the most studied topics in distributed database research. Yet their implementation in commercial systems is not widespread. The performance implications of implementing distributed transactions, which are not fully understood, make them unpopular among vendors. Commercial systems provide varying degrees of distributed transaction support. Some (for example, Oracle Corp.'s Oracle) require users to have one database open at a given time, thereby eliminating the need for distributed transactions. Others (for example, Sybase Inc.'s Sybase SQL Server) implement the basic primitives necessary for the 2PC protocol but require the user applications to coordinate the commit actions. In other words, the distributed DBMS does not enforce atomicity of distributed transactions but provides the basic primitives by which user applications can enforce it. Other systems, however, implement the 2PC protocols fully (ASK/Ingres's Ingres and Tandem Computers' NonStop SQL, for example).

BETTER PERFORMANCE 

The case for the distributed DBMS's superior performance is usually based on two points. First, a distributed DBMS fragments the conceptual database, enabling data to be stored in close proximity to its points of use. This feature, called data localization, has two potential advantages. Since each site handles only a portion of the database, contention for CPU and I/O services is not as severe as for centralized databases. Localization also reduces remote access delays, which usually occur in wide area networks (for example, the minimum round-trip message propagation delay in satellite-based systems is approximately one second). Most distributed DBMSs are structured to gain maximum benefit from data localization. Full benefits of reduced contention and reduced communications overhead can be obtained only through proper fragmentation and distribution of the database.
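A small sketch may make data localization concrete. It assumes a customer relation fragmented horizontally by region, with one fragment per site; the FRAGMENTS data, sites_for, and select_customers are hypothetical and stand in for what a distributed query processor would do.

```python
# Hypothetical horizontal fragmentation of a customer relation by region.
FRAGMENTS = {
    "east": [{"id": 1, "region": "east", "name": "Ames"}],
    "west": [{"id": 2, "region": "west", "name": "Boyd"},
             {"id": 3, "region": "west", "name": "Cole"}],
}

def sites_for(region=None):
    """Sites that must be contacted for a query restricted to `region`."""
    return [region] if region in FRAGMENTS else sorted(FRAGMENTS)

def select_customers(region=None):
    """Evaluate the selection, touching only the relevant fragments."""
    rows = []
    for site in sites_for(region):
        rows.extend(r for r in FRAGMENTS[site]
                    if region is None or r["region"] == region)
    return rows
```

A query restricted to one region is answered entirely by the site that stores that fragment; only an unrestricted query has to visit every site.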

The second point in favor of the distributed DBMS's performance advantage is that the inherent parallelism of distributed systems can be exploited for interquery and intraquery parallelism. Interquery parallelism results from the execution of multiple queries at the same time. Intraquery parallelism is achieved by breaking up a single query into a number of subqueries, each executed at a different site, accessing a different part of the distributed database.
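Intraquery parallelism can be sketched as follows, with worker threads standing in for sites. The SITE_DATA fragments and the function names are assumptions for illustration; the point is only that a global aggregate decomposes into per-site subqueries whose partial results are merged.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments of an orders relation (order amounts), one per site.
SITE_DATA = {
    "site1": [120, 80, 45],
    "site2": [300, 15],
    "site3": [60, 60, 60, 10],
}

def subquery_total(site):
    """Subquery run at one site: SUM(amount) over the local fragment."""
    return sum(SITE_DATA[site])

def total_amount():
    """Split the global SUM into per-site subqueries and merge the results."""
    with ThreadPoolExecutor(max_workers=len(SITE_DATA)) as pool:
        partials = list(pool.map(subquery_total, SITE_DATA))
    return sum(partials)
```

Interquery parallelism would correspond to several independent calls of this kind running at the same time.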

If user access to the distributed database consisted only of querying (read-only access), provision of interquery and intraquery parallelism would imply that as much of the database as possible should be replicated. However, since most database accesses are not read-only, the mixing of read and update operations requires the implementation of elaborate concurrency control and commit protocols.

Today's commercial systems use two alternative execution models (other than the implementation of full distributed transaction support) to improve performance. The first alternative is to have the database open only for queries (read-only access) during regular operating hours, while updates are batched. The database is then closed to query activity during off-hours, when the batched updates are run sequentially. This alternative is time-multiplexing between read activity and update activity.

The second alternative is based on multiplexing the database: Two copies of the database are maintained, one for ad hoc querying (called the query database) and the other for updates by application programs (called the production database). At regular intervals, the production database is copied to the query database. This alternative does not eliminate the need to implement concurrency control and reliability protocols for the production database since these functions are necessary to synchronize the write operations on the same data; however, it improves the performance of queries since they can be executed without transaction manipulation overhead.
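The query database and production database arrangement can be imitated with two SQLite databases and the backup method of Python's sqlite3 connections. The stock table and the helper names are assumptions for this sketch; a production installation would schedule the copy rather than call it by hand.

```python
import sqlite3

production = sqlite3.connect(":memory:")  # updated by application programs
query_db = sqlite3.connect(":memory:")    # read by ad hoc queries

production.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
production.execute("INSERT INTO stock VALUES ('bolt', 100)")
production.commit()

def refresh_query_copy():
    """Periodic step: copy the whole production database to the query copy."""
    production.backup(query_db)

def qty(conn, item):
    rows = conn.execute("SELECT qty FROM stock WHERE item = ?", (item,)).fetchall()
    return rows[0][0]

refresh_query_copy()
production.execute("UPDATE stock SET qty = 90 WHERE item = 'bolt'")
production.commit()

stale = qty(query_db, "bolt")  # still the value as of the last refresh
refresh_query_copy()
fresh = qty(query_db, "bolt")  # reflects the update after the next refresh
```

Queries against the copy never wait for updates, but between refreshes they see data that may be out of date, which is the trade-off this alternative accepts.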

The performance characteristics of distributed database systems are not well understood. Not enough true distributed database applications are available to provide a sound basis for practical judgments. In addition, performance models are not sufficiently developed. The database community has developed a number of benchmarks to test the performance of transaction-processing applications, but it is not clear whether they can be used to measure the performance of distributed transaction management. The performance results of commercial DBMS products, even with respect to these benchmarks, generally are not openly published. NonStop SQL is one product for which performance figures, as well as the experimental setup used in obtaining them, have been published.

EASIER SYSTEM EXPANSION 

In a distributed environment, accommodating increasing database sizes should be easier. Major system overhauls are seldom necessary; expansion can usually be handled by adding processing and storage power to the system. We call this database size scaling, as opposed to network scaling, which we will discuss later. It may not be possible to obtain a linear increase in power, since this increase also depends on the distribution overhead, but significant improvements are still possible.

Microprocessor and workstation technologies have played a role in improving economies. Many commercial distributed DBMSs operate on minicomputers and workstations to take advantage of their favorable price-performance characteristics. Moreover, most commercial distributed DBMSs operate within local area networks, for which workstation technology is most suitable. The emergence of distributed DBMSs that run on wide area networks may increase the importance of mainframes. On the other hand, future distributed DBMSs may support hierarchical organizations in which sites consist of computer clusters communicating over a local area network, with a high-speed backbone wide area network connecting the clusters.

Another economic factor is the trade-off between data communications and telecommunications costs. In the previous section, we argued that data localization improves performance by reducing delays. It also reduces costs. Consider an application (such as inventory control) that needs to run at several locations. If this application accesses the database frequently, distributing the data and processing it locally may be more economical than executing the application at various sites and making remote accesses to a central database stored at another site. In other words, the cost of distributing data and shipping some of it periodically from one site to the other to execute distributed queries may be lower than the telecommunications cost of frequently accessing a remote database. This part of the economics argument is still speculative. As we indicated before, most distributed DBMSs are local area network products, and how they can be extended to operate in wide area networks is a topic of discussion and controversy.


MORE TO COME 

Next month, we will discuss the unsolved problems of distributed DBMSs and the new issues associated with this changing technology.

©1991 IEEE. Reprinted with permission from Computer (Vol. 24, No. 8, pp. 68-78, August 1991).

The authors would like to thank Abdelsalam Heddaya of Boston University, who not only reviewed the entire article and provided many comments, but also helped with the discussion of replication, which is based on a draft he wrote. Thanks to Alex Biliris of Boston University, and Michael Brodie, Alex Buchmann, Dimitrios Georgakopoulos, and Frank Manola, all of GTE Laboratories, who also read the entire manuscript and provided many suggestions regarding content and presentation that improved the article significantly.

M. Tamer Ozsu's research was partially supported by the Natural Sciences and Engineering Research Council of Canada under operating grant OGP-0951.

REFERENCES 

1. Stonebraker, M. Readings in Database 
Systems, Morgan Kaufmann, San Mateo, 
California, 1988, p. 189. 

2. Stonebraker, M. "Future Trends in 
Database Systems," IEEE Trans. Knowledge 
and Data Eng., 1(1): 33-44, March 1989. 

3. Garcia-Molina, H. and B. Lindsay,
"Research Directions for Distributed Databases,"
IEEE Q. Bull. Database Eng., 13(4):
12-17, December 1990.

4. Ozsu, M.T., and P. Valduriez, Principles 
of Distributed Database Systems, Prentice 
Hall, Englewood Cliffs, New Jersey, 1991. 

5. Gray, J. "Transparency in Its Place — 
The Case Against Transparent Access to 
Geographically Distributed Data," Tech. 
Report TR89.1, Tandem Computers,
Cupertino, California, 1989.

M. Tamer Ozsu is an associate professor in the Department of Computing Science at the University of Alberta (Edmonton, Canada), where he leads a research group that investigates distributed databases, object-oriented databases, and database operating system design issues. He is the author or coauthor of 12 books as well as a number of technical papers dealing with database technology.

Patrick Valduriez is a director of research at INRIA, the national research center for computer science in France. He is currently heading a project on advanced database technology, including rule-based and object-oriented databases and parallelism. He is the author or coauthor of over 50 technical papers and several books on database systems.

To contact the authors, write to M. Tamer Ozsu, Dept. of Computing Science, University of Alberta, Edmonton, Canada T6G 2H1. His e-mail address is ozsu@cs.ualberta.ca.


EasyCASE Plus 


The affordable approach to software engineering... only $495


Finally, there’s a CASE tool that won’t
get in the way of your creativity... A 
tool that makes structured analysis, 
structured design and data modeling 
as easy as working with any other tool 
on your PC - EasyCASE Plus! Using 
EasyCASE Plus’ new, easy to use 
graphical user interface (GUI), you’ll 
be creating and editing charts, linking 
them, and building your data dictionary 
in no time. As well as being easy to 
use and easy to learn, EasyCASE 
Plus is easy on your budget! Ask any 
user. They’ll tell you it’s the best buy 
for your PC based CASE tool needs. 
Discover why over 4,000 software 
professionals use EasyCASE Plus 
and how you can join them! 


Requirements: 

Runs on: IBM PC or PS/2 (AT recommended), 
DOS 3.1 or higher, EGA/VGA color, mouse, 640 K 
RAM (500 K free), 1 MB EMS recommended, 
math co-processor supported. Printers/Plotters 
Supported: Epson FX & LQ, IBM Graphics & 
Proprinter X24, HP QuietJet, DeskJet, &
LaserJet, HP Plotters, PostScript.

EasyCASE Professional.$649 

(includes integrated DFD level balancing and 
data dictionary/diagram analysis) 



“EasyCASE Plus is a well designed, low priced 
tool that is easy to learn and provides excellent
diagramming capabilities... EasyCASE Plus is 
an excellent investment. ” 

Methods: 

■ Yourdon/DeMarco 

■ Gane & Sarson 

■ Ward-Mellor/Hatley 

■ Yourdon/Constantine 

■ Martin 

■ Chen, Bachman 


COMPUTER 

LANGUAGE 


Diagram Types: 

■ Data Flow Diagrams (DFDs) 

■ Structure Charts 

■ State Transition Diagrams 

■ Entity Relationship (ERDs) 

■ Data Model Diagrams 

■ Transformation Schema 
(real-time DFDs) 



Features: 

■ IBM SAA/CUA compliant graphical user 
interface (GUI) 

■ Extensive diagram editing features 

■ Integrated dBASE III compatible data 
dictionary 

■ Integrated dictionary manager, reports 
manager, process editor 

■ Hierarchical chart linking & process 
decomposition 

■ Record and element definitions 

■ Extensive printer, plotter and desktop 
publishing support 

■ Data dictionary import, export, and merge 

■ On-line help 

■ Comprehensive documentation with tutorial 

■ Access to your database, word processor, 
DOS, etc. 

■ Integrated diagram analysis (optional) 


Evergreen CASE Tools
16650 NE 79th Street, Suite 200
Redmond, WA 98052

FAX: (206) 883-7676

Call today for a brochure! 

Tel: (206) 881-5149 


©1991 by Evergreen CASE Tools, Inc., All Rights Reserved. All trademarks are the property of their respective companies. 


CIRCLE 25 ON READER SERVICE CARD 

MARCH 1992 

INTERNATIONAL DB2 USERS GROUP
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International Conference 

"Target the Future — Managing Your DB2 Environment" 
May 10-14, 1992
New York Hilton & Towers 


IDUG Conference Brings Unduplicated 
Education and Networking Experiences 
to All Levels of DB2 Users 

Beginning May 10 through May 14, 1992, the
International DB2 Users Group (IDUG) builds
on the experience and strengths of its three
previous conferences to bring knowledgeable
relational technology industry sources
and technical education opportunities to the
New York Hilton & Towers in New York City.
The event is IDUG's 4th Annual International
Conference, “Target the Future — Managing 
Your DB2 Environment.” The value is an
unprecedented selection of quality training
sessions critical to productive and effective
job performance.



participants with unmatched take-home value. 
Peer networking opportunities — among a 
strong international presence — are limitless. 
Just as important, the cost to attend is 
minimal; $795 (before March 20) purchases a 
full conference registration, including meals. 

This year's keynote speakers include: 

Jeff Tash, founder and president of 
Database Decisions, presenting "DB2 as a 
Foundation for IBM's Frameworks"; 

Earl Wheeler, senior vice president and 
general manager of Programming Systems 
for IBM, discussing IBM's newest composite 
application, "The Information Warehouse 
Framework"; 


Bringing together more than 1,800 DB2 users and 
suppliers from around the world, this major industry 
event features expert speakers and technical sessions 
geared to nine key levels of relational technology 
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Executive Development. They are: 
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New DB2 Users 
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speakers considered to be the best of the industry; an 
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favorite speakers; and, for the first time, “Exhibit 
Only" passes. 

What Makes IDUG Conferences Worth the 
Investment? 

IDUG is an independent, non-profit, volunteer-driven 
organization consisting of user, associate and vendor 
members. IDUG conferences stress objective, fresh 
presentations geared strictly to users, while vendor 
and IBM participation bring the conference full circle. 
Meetings of special interest groups, highly informative 
panel presentations, and an impressive display of new 
products and technology provide conference 


Colin White, founder and president of Database 
Associates International, presenting new information 
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Gabrielle Wiorkowski, founder 
and president of Gabrielle & 

Associates, presenting "Index 
Design, Joins, and Subselect 
Performance ." 
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Windows Client/server workshop 

Test Driving Tools for the New 
Application Development Environments 

Windows, Presentation Manager, Apple, Motif and OpenLook 

Boston ■ April 28-30, 1992 


How do you build client/server applications that run in a windows environment? 
Are you in a quandary, unsure of what to do, what to buy, or who to buy from?


Added Educational Value 

The Windows Client/server Workshop also offers you the chance to hear 
leading consultants and industry gurus discuss the major issues surrounding 
windows client/server technology. 


To help you compare and contrast the full power of various windows products’ features and benefits, each vendor at The Windows Client/Server Workshop has 75 minutes to walk you through an entire application development effort, from soup-to-nuts.

Windows tools must be seen in action because these products defy simple comparisons. Many are radically different, particularly in their approach to application development; that’s why live demos are mandatory.

To assist you in your windows tools selection process, The Windows Client/Server Workshop ends each day with hospitality suites where you can “roll-up-your-sleeves” and “test drive” the previously demo’d products.

A partial listing of vendors 
who will showcase their 
windows tools: 

■ Cognos 

■ Cooperative Solutions 

■ Matesys Corporation 

■ Micro Data Base Systems, Inc* 

■ ONTOS, Inc* 

■ ParcPlace Systems, Inc. 

■ Powersoft* 


Speaker Presentations 

■ Frameworks for Client/Server Computing by Conference Chairman Jeff Tash, Database Decisions

■ LAN Survival Guide by Larry DeBoever, Tucker/DeBoever Technologies

■ Integrating Client/Server and CASE by Pieter Mimno, Technology Insight, Inc.

■ Product Comparisons: Strengths, Weaknesses and Commentary by Conference Chairman Jeff Tash, Database Decisions


Technical Presentations

■ Effective GUI Design by Christine Comaford, Corporate Computing, Inc.

■ Dynamic Data Exchange (DDE) and Object Linking & Embedding (OLE) by Kim Crouse, Synaptix

■ Beyond CUA, How to Set Successful GUI Design Standards by Christine Comaford, Corporate Computing, Inc.

■ Windows Networking: Using IPCs & RPCs by Greg Denenfeld, Denenfeld Systems Design


Hot New Products and Companies to Watch 

This panel discussion, moderated by Fred Langa, Windows Magazine, brings 
together a who’s who of distinguished computer industry press members who 
will share their opinions and ideas about the latest and greatest new products 
and companies on the windows scene. Panel includes representatives from PC
Week, Computerworld, InfoWorld and Byte.

Call today to register or receive more information on 
this exciting new workshop (508) 470-3880. 


■ Revelation Technologies, Inc* 


Sponsored by 


Shouldn’t You Be Participating Too? 
Call Kathleen Spencer at 
(508) 470-3870 for More 
Information. 

* Conference Co-Sponsor 
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BY DEBORAH L. BROOKS 


Primary keys, the building blocks of a stable database, are often compromised to fit real-world situations. Here's a technique to provide controlled flexibility.

SURROGATE KEYS (artificial, meaningless, arbitrary identifiers) are often proposed as alternatives when problems with primary keys occur. But what factors must data analysts and database designers take into consideration to determine if surrogate keys are best suited to take the role of primary identifier?

First, does an attribute or set of attributes that has all the required properties of a primary key exist? A primary key should be unique to differentiate one occurrence of an entity from another. A primary key should also be explicit, in that a value exists for every entity occurrence known upon its creation; definite, so its initial value is not null; consistent, so each occurrence is associated with only one value of its identifier and the value is the same wherever that occurrence is referenced; and stable, so its value never changes or becomes null as long as that occurrence exists. A primary key should also be factless so it identifies but does not describe the entity occurrence. A factless identifier increases the likelihood that the primary key will be unique, explicit, definite, consistent, and stable.
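Two of these properties, uniqueness and definiteness, can be checked mechanically against existing data. The following Python sketch does so for a candidate key; the key_problems function and the employee rows are hypothetical. Stability and factlessness concern how values behave over time and what they mean, so they remain matters of analysis that a snapshot check like this cannot settle.

```python
def key_problems(rows, key_columns):
    """Report which primary key properties a candidate key violates in `rows`."""
    problems = set()
    seen = set()
    for row in rows:
        value = tuple(row.get(col) for col in key_columns)
        if any(part is None for part in value):
            problems.add("not definite (null component)")
            continue
        if value in seen:
            problems.add("not unique")
        seen.add(value)
    return sorted(problems)

# Hypothetical employee rows used only for illustration.
employees = [
    {"emp_no": 1, "surname": "Lee", "dept": "A"},
    {"emp_no": 2, "surname": "Lee", "dept": "A"},
    {"emp_no": 3, "surname": "Roy", "dept": None},
]
```

Here key_problems(employees, ["emp_no"]) reports nothing, while the descriptive pair ["surname", "dept"] fails on both counts.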

In addition to these fundamental properties of primary keys, the attributes are ideally accessible and understandable to users and conform to any DBMS or installation-specific limitations. And whenever possible, a primary key should also be controllable so a suitable identifier can be assigned.

PRIMARY KEY PROBLEMS 

Given this impressive list of qualifications for attributes aspiring to be primary keys, no wonder problems occur in the real world. The first problem is a primary key whose value is subject to change, could become null, or is composed of attributes that could become nonkey, which is known as a volatile primary key. If the primary key can change, the uniqueness of its value may be compromised. In addition, if the primary key can become null, the properties of consistency and explicitness are also in jeopardy since a null does not signify a valid value.

How and When to Use Surrogate Keys

For primary key attributes that become nonkey, the stability, consistency, and explicitness of the primary key are affected. The data administrator must also be concerned with updating the logical data model and data dictionary entries to reflect the removal of this attribute from the primary key.

A volatile primary key can create problems with maintaining referential integrity as well as performing joins and other operations (such as attribute concatenation), which are impacted by nulls. And for ordering data, nulls will sort to either the end or the beginning of the output list, depending on the chosen relational DBMS.
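The effects of nulls on ordering, joins, and concatenation are easy to observe. The sketch below uses SQLite through Python's sqlite3 module; the part and price tables are invented for the example, and other DBMSs may order nulls differently, which is exactly the portability concern raised above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (code TEXT, descr TEXT)")
conn.execute("CREATE TABLE price (code TEXT, amount INTEGER)")
conn.executemany("INSERT INTO part VALUES (?, ?)",
                 [("B2", "bolt"), (None, "nut"), ("A1", "gear")])
conn.executemany("INSERT INTO price VALUES (?, ?)",
                 [("A1", 5), ("B2", 7), (None, 9)])

def column(sql):
    return [row[0] for row in conn.execute(sql)]

# SQLite sorts nulls first in ascending order; other DBMSs may sort them last.
nulls_first = column("SELECT code FROM part ORDER BY code")
nulls_last = column("SELECT code FROM part ORDER BY code IS NULL, code")

# A null never equals anything, itself included, so its row drops out of the join.
joined = column("SELECT p.descr FROM part p JOIN price r ON p.code = r.code "
                "ORDER BY p.descr")

# Concatenating with a null yields a null.
labels = column("SELECT code || '-' || descr FROM part ORDER BY descr")
```

The part with the null code is present in both tables yet never appears in the join result, and its concatenated label is itself null.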

The second problem concerns the concatenation of multiple attributes into one denormalized primary key attribute. Because multiple attributes are combined into one, the business meaning of data is not clear, which could hamper user accessibility. If any part of the primary key attribute has the potential to change or become null, it is more difficult to deal with than if the attributes had remained separate. A denormalized primary key also presents problems with joins, range-checking, sorting, ordering, grouping, index definition, application performance, referential integrity, and domain enforcement. Because the attributes must always be treated as a unit, some operations are precluded and others accomplished only through the use of inefficient operators, such as substringing.
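To see why substringing becomes necessary, compare a range check against a packed key with the same check against separate columns. This is a sketch under assumed names: item_packed stores a two-letter plant code and a five-digit serial number in one column, and item_split keeps them apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Denormalized: plant code and serial number packed into one key column.
conn.execute("CREATE TABLE item_packed (item_key TEXT PRIMARY KEY)")
# Normalized: the same facts kept as separate key columns.
conn.execute("CREATE TABLE item_split (plant TEXT, serial INTEGER, "
             "PRIMARY KEY (plant, serial))")
for plant, serial in [("NY", 15), ("NY", 240), ("LA", 15)]:
    conn.execute("INSERT INTO item_packed VALUES (?)", (f"{plant}{serial:05d}",))
    conn.execute("INSERT INTO item_split VALUES (?, ?)", (plant, serial))

# Range check on the serial number: the packed key needs substr and a cast.
packed = conn.execute(
    "SELECT item_key FROM item_packed "
    "WHERE CAST(substr(item_key, 3) AS INTEGER) BETWEEN 1 AND 100 "
    "ORDER BY item_key").fetchall()
split = conn.execute(
    "SELECT plant, serial FROM item_split "
    "WHERE serial BETWEEN 1 AND 100 ORDER BY plant, serial").fetchall()
```

Both queries find the same two items, but the packed version has to take the key apart in every row before it can compare anything.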

The third problem is an all-purpose primary key. In this case, either one attribute or another can function as the primary identifier because values never occur simultaneously in both. Both attributes are layered on top of each other to create a primary key that does not have a null component. The all-purpose primary key is created by taking the most forgiving data type (usually character) and longest length of the two attributes, calling it some combination of their names, and populating it with whichever value is present in each entity occurrence.

An all-purpose primary key is, in effect, a form of a denormalized primary key since the primary key attribute is really composed of two attributes. Instead of denormalization through the concatenation of attributes, the attributes are superimposed upon one another. This approach could compromise primary key uniqueness if the range of values for the two attributes overlaps, and it is difficult for the user to understand the attribute's business meaning since it means different things in different circumstances. Overlapping logical domains can also complicate range-checking and ordering or grouping if data keyed by one attribute is to be separated from data keyed by the second attribute.
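The uniqueness risk of an all-purpose key shows up as soon as the two value ranges overlap. In this sketch the party records, customer_no, and vendor_code are hypothetical; all_purpose_key superimposes the two attributes into one character value as described above.

```python
# Hypothetical party records: each has either a customer number or a vendor code.
parties = [
    {"customer_no": 1042, "vendor_code": None, "name": "Acme Retail"},
    {"customer_no": None, "vendor_code": "1042", "name": "Zenith Supply"},
    {"customer_no": 77, "vendor_code": None, "name": "Orion Ltd"},
]

def all_purpose_key(party):
    """Superimpose the two attributes into one character column."""
    value = party["customer_no"]
    if value is None:
        value = party["vendor_code"]
    return str(value)  # the 'most forgiving' data type: character

def collisions(rows):
    """Key values claimed by more than one entity occurrence."""
    owners = {}
    for row in rows:
        owners.setdefault(all_purpose_key(row), []).append(row["name"])
    return {key: names for key, names in owners.items() if len(names) > 1}
```

Customer 1042 and vendor "1042" are different entity occurrences, yet they end up with the same key value.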

A primary key that is not universally agreed upon poses the last problem. This situation is called a nonuniversal or parochial primary key and occurs when the entity life-cycle phases fall in different functional organizations. For example, the planning department may identify a product by a different set of attributes or at a different level than the manufacturing department, and so on through the warehousing, marketing, and sales departments. Even if an attribute of the same name is used, such as product number, it could have different logical and physical domains.

For this situation to occur, data models for each functional area would have had to be developed independently. Enforcing uniqueness across functional areas is an impossibility when the same primary key is not used by all areas. User accessibility and understandability of the data across departments is similarly impossible, prohibiting any meaningful cross-departmental or historical tracking. Nor can the primary key be considered stable when it can take on different department-specific physical representations, values, and meanings.

CONSIDERATIONS 

While another attribute or group of attributes that better fulfills the properties of a primary key (known as an alternate key) may exist, a surrogate key should always be considered. Because of the rapidly changing nature of business, a possibility always exists that any identifier that is not factless could become volatile. But before you decide to use a surrogate key, you must weigh a number of other factors.

The length of the primary key should be considered. Long primary keys have implications for DASD consumption to store the indexes supporting the primary key and its occurrence in other tables as a foreign key. A long primary key also impacts the processing time required for such utilities as load, backup, and recovery, and such operations as joins, scans, inserts, and deletes. A surrogate key has the advantage of being relatively short in comparison.



If the primary key is a composite or multicolumn key, all columns of the primary key must be specified to retrieve a row or perform a join. Creating views to define selection or join criteria, or creating synonyms for the concatenation of the columns, can alleviate this problem. All columns are also needed to maintain referential integrity, which could prove onerous. A multicolumn primary key also may be very long, with all the attendant concerns thereof. A surrogate key is almost always a single attribute, which simplifies referential integrity and the coding of SQL statements.
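The difference in SQL coding can be seen by writing the same join both ways. The schema below is an assumption made for illustration (order_c and line_c use a three-column composite key, order_s and line_s a single surrogate column named order_id), run here with SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Composite key: every column must appear in the foreign key and in each join.
CREATE TABLE order_c (region TEXT, branch TEXT, order_no INTEGER,
                      PRIMARY KEY (region, branch, order_no));
CREATE TABLE line_c (region TEXT, branch TEXT, order_no INTEGER, sku TEXT);
-- Surrogate key: one meaningless column identifies the order; the original
-- attributes stay on the row as ordinary columns with a unique constraint.
CREATE TABLE order_s (order_id INTEGER PRIMARY KEY,
                      region TEXT, branch TEXT, order_no INTEGER,
                      UNIQUE (region, branch, order_no));
CREATE TABLE line_s (order_id INTEGER REFERENCES order_s, sku TEXT);
INSERT INTO order_c VALUES ('E', 'B1', 7);
INSERT INTO line_c VALUES ('E', 'B1', 7, 'bolt'), ('E', 'B1', 7, 'nut');
INSERT INTO order_s VALUES (1, 'E', 'B1', 7);
INSERT INTO line_s VALUES (1, 'bolt'), (1, 'nut');
""")

composite = conn.execute(
    "SELECT l.sku FROM order_c o JOIN line_c l "
    "ON l.region = o.region AND l.branch = o.branch AND l.order_no = o.order_no "
    "ORDER BY l.sku").fetchall()
surrogate = conn.execute(
    "SELECT l.sku FROM order_s o JOIN line_s l ON l.order_id = o.order_id "
    "ORDER BY l.sku").fetchall()
```

Both joins return the same rows; the surrogate version needs one join predicate and one foreign key column instead of three.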

Index limitations posed by the chosen DBMS must also be considered. An index limitation on an index's length or the number of columns it could reference could preclude the use of an index to enforce uniqueness. In this case, you can enforce uniqueness through application logic, so a surrogate key may provide a good alternative.

The existence of any of these factors will tip the scales in favor of using surrogate keys. The following considerations may weight the decision in the other direction.

Installation-specific limitations can play a role in the decision to use surrogate keys, especially when coupled with a requirement for user access by the original primary key rather than the surrogate key. The DBA may place a limit on the number of indexes that can be defined on a table. As more indexes are defined on a table, the amount of required DASD increases, as does the probability of locking problems; the processing time required for inserts, updates, reorganizations, recovery, backup, load, and statistics gathering; the object administration to create the indexes; the probability of reaching the limit on the number of open datasets; and the complexity of table or partition recovery. If users require access by the original primary key and an installation limitation on the number of indexes allowed is present, using a surrogate key may not be feasible.

The primary determinant in using a surrogate key is which organization owns the business information and assigns the primary key. If the primary key is assigned externally, a surrogate key probably will not be the appropriate choice due to the inability to identify the same entity occurrence across organizational boundaries consistently. If the primary key assignment is internally controlled, the decision to use surrogate keys is entirely up to the organization's database designers and is based on a thorough analysis of the business information requirements.

However, an externally as¬ 
signed primary key is not control¬ 
lable, probably not factless, and 
more prone to volatility and other 
primary key problems. If this is 
the case, then a surrogate key may 
be used internally but the original 
primary key attributes must be 
maintained for providing cross- 
organizational tracking. 

Another exception to using 
surrogate keys for externally con¬ 
trolled data is when two self- 
contained systems with different 
primary keys are integrated and 
surrogate keys are already being 
used by the target system. You 
should create a cross-reference of 
the original primary key to the 
new surrogate primary key to pro¬ 
vide a bridge between systems as 
long as required. If the primary 
key assignment is internally con¬ 
trolled, the decision to use surro¬ 
gate keys is entirely up to the or¬ 
ganization's database designers. 
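
The cross-reference between original and surrogate keys can be pictured with a small sketch. This is only an illustration under stated assumptions: the class name `KeyCrossReference`, its methods, and the sample key values are hypothetical, and an in-memory dictionary stands in for what would be a cross-reference table in a real database.

```python
from itertools import count


class KeyCrossReference:
    """Hypothetical bridge between original primary keys and surrogate keys."""

    def __init__(self, start=1):
        self._next = count(start)   # serial surrogate values (an assumption)
        self._by_original = {}      # original key -> surrogate key
        self._by_surrogate = {}     # surrogate key -> original key

    def surrogate_for(self, original_key):
        """Return the surrogate for an original key, assigning one on first sight."""
        if original_key not in self._by_original:
            surrogate = next(self._next)
            self._by_original[original_key] = surrogate
            self._by_surrogate[surrogate] = original_key
        return self._by_original[original_key]

    def original_for(self, surrogate):
        """Translate back for cross-organizational tracking."""
        return self._by_surrogate[surrogate]


xref = KeyCrossReference()
first = xref.surrogate_for(("ACME", "A-17"))   # made-up external key
again = xref.surrogate_for(("ACME", "A-17"))   # same entity, same surrogate
```

Because the original key is kept alongside the surrogate, either system can still identify the same entity occurrence for as long as the bridge is required.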

If a surrogate key is assigned 
as the primary key, the values 
must be made known to the users 
immediately. The original primary 
key attributes should be included 
as columns in the table and an in¬ 
dex should be defined on them if 
user access warrants it and instal¬ 
lation limitations allow. 

Once you make the decision 
to use surrogate keys, implications 
for different types of indexes exist. 
Using a random assignment algo¬ 
rithm for a surrogate key risks the 
minute possibility of duplicate values.
This risk can be controlled
through application logic to generate
another value should a duplicate
be encountered. A surrogate
key functioning as a primary
key is not a good basis for a clus¬ 
tered index since the values have 
no inherent business meaning. 
While a unique index can be de¬ 
fined on the surrogate key, a clus¬ 
tering index on the original pri¬ 
mary key or another attribute may 
better support access requirements. 
In tables where the surrogate key 
functions as a foreign key, the 
definition of a clustering index on 
the surrogate key may be appro¬ 
priate to access the data as re¬ 
quired. An example of such a clus¬ 
tering index would be on the 
surrogate key customer number in 
a customer invoice table. 
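
The application logic that guards against duplicate random values might look something like the following sketch. The function name, the size of the key space, and the retry limit are all assumptions made for illustration; in practice the duplicate check would normally be a unique index or a lookup against the table rather than an in-memory set.

```python
import random


def assign_random_key(existing_keys, key_space=10**9, max_attempts=10, rng=random):
    """Pick a random surrogate value, generating another if a duplicate turns up.

    existing_keys is a set standing in for the keys already in the table.
    """
    for _ in range(max_attempts):
        candidate = rng.randrange(key_space)
        if candidate not in existing_keys:
            existing_keys.add(candidate)
            return candidate
        # Duplicate encountered: loop around and generate another value.
    raise RuntimeError("no unused key found within max_attempts")
```

The retry limit matters only when the key space is nearly full; with a large key space, a second attempt is rarely needed.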

When partitioning is re¬ 
quired, serial assignment of a sur¬ 
rogate will result in a skewed dis¬ 
tribution of rows across partitions. 
The first partition would be com¬ 
pletely filled before any rows are 
inserted into the second partition. 
Because of the amount of activity 
occurring on one partition, lock¬ 
ing problems may ensue. 

An alternative to randomly 
assigned surrogate keys for resolv¬ 
ing this problem is a modified ser¬ 
ial assignment algorithm. Instead 
of vertically filling each partition 
before going on to the next, the 
surrogate key values can be as¬ 
signed to insert one row into each 
partition before inserting a sec¬ 
ond. This horizontal assignment 
ensures a uniform distribution of 
rows across partitions and elimi¬ 
nates potential locking problems. 
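
One way to picture the horizontal assignment is the sketch below. It assumes, purely for illustration, that partitions are defined by equal-sized key ranges; the function names and the range size are hypothetical and not taken from any particular DBMS.

```python
def horizontal_key(sequence_number, partitions, partition_size):
    """Map the n-th inserted row (counting from 0) to a surrogate key value.

    Consecutive rows land in consecutive partitions, so one row goes into
    each partition before any partition receives a second row.
    """
    partition = sequence_number % partitions
    offset = sequence_number // partitions
    if offset >= partition_size:
        raise ValueError("key range of the partition is exhausted")
    return partition * partition_size + offset


def partition_of(key, partition_size):
    """Partition that a key falls into under range partitioning."""
    return key // partition_size


keys = [horizontal_key(n, partitions=4, partition_size=1000) for n in range(8)]
```

With four partitions of 1,000 keys each, the first eight rows receive the keys 0, 1000, 2000, 3000, 1, 1001, 2001, and 3001, so insert activity rotates through the partitions instead of concentrating on the first.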

ALTERNATIVES 

If primary key problems prohibit 
the use of the original primary key 
and if the surrogate key is not a 
viable alternative, other options 
exist. A timestamp can be used as 
the primary key or a primary key 
component. Except in the case of 
batched arrival, uniqueness would 
be ensured. A timestamp could 
also be meaningful if arrival se¬ 
quence has importance to the enti¬ 
ty in question. 

A timestamp, however, can be 
quite long (26 bytes in DB2), which 
dramatically increases the size of 
the primary key and raises the is¬ 
sue of all the DASD, performance, 
and index limitation considerations 
included in the discussion on surrogate
keys. In addition, the timestamp
represents adding another
column to the primary key—so the 
multicolumn primary key consid¬ 
erations apply as well. And it does 
not resolve the problems associat¬ 
ed with externally originating data 
since the only controllable part of 
the primary key is the timestamp. 
Also, clustering and partitioning 
indexes should not be defined on 
a primary key that consists of a 
timestamp alone because locking 
problems could ensue with a large 
volume of inserts. 

While a surrogate key may 
not work as the primary key, it 
may be suitable as a primary key 
component. If the primary key at¬ 
tributes do not ensure uniqueness, 
adding a surrogate key in the form 
of a sequential counter will. This 
addition is commonly made to iden¬ 
tify invoice or order line items. 

Adding a sequential counter 
does not increase the primary key's 
overall length by much, but it may 
raise the issues of DASD consump¬ 
tion, performance degradation, and 
installation-specific index limitations 
I discussed earlier. If the data origi¬ 
nates in an external organization, 
the only controllable part of the pri¬ 
mary key is the sequential counter. 
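
The order line item case can be sketched as follows. The example uses Python's built-in sqlite3 module only as a convenient stand-in for a DBMS; the table name, column names, and the `add_line` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE order_line (
           order_no INTEGER NOT NULL,
           line_no  INTEGER NOT NULL,   -- sequential counter within an order
           item     TEXT    NOT NULL,
           PRIMARY KEY (order_no, line_no))"""
)


def add_line(conn, order_no, item):
    """Insert a line item, numbering it one past the order's highest line."""
    (next_line,) = conn.execute(
        "SELECT COALESCE(MAX(line_no), 0) + 1 FROM order_line WHERE order_no = ?",
        (order_no,),
    ).fetchone()
    conn.execute(
        "INSERT INTO order_line (order_no, line_no, item) VALUES (?, ?, ?)",
        (order_no, next_line, item),
    )
    return next_line
```

The same item can appear twice on one order because the counter, not the item, completes the key. In a multiuser system the read of the highest line number and the insert would have to run under a lock or in one transaction; the sketch ignores that.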

A final alternative to surro¬ 
gate keys is the substitute key. The 
value of the primary key or any of 
its components is abbreviated or 
encoded to reduce its length. All 
substitute values are predetermined 
to represent a specific business 
meaning prior to their assignment 
as a primary key. This approach is 
unlike the surrogate key, which has 
no inherent business meaning and 
whose value is determined at time 
of assignment. The abbreviations 
for airline carriers and airports are 
examples of substitute keys. 

Substitute keys can substantial¬ 
ly reduce the primary key length, 
so many of the length-related con¬ 
siderations are no longer an issue. 
However, controllability will still
be an issue with externally originating
data. Only the assignment
of the substitute key is controlla¬ 
ble. Using substitute keys with ex¬ 
ternal data may not be possible if 
the external data is volatile. The 
designation of the valid substitute 
keys cannot be made on the fly. It 
must be made in advance of the as¬ 
signment of the substitute key to 
entity occurrences. 
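
A substitute key amounts to a lookup against a list of codes fixed in advance. The sketch below uses three familiar airport codes; the dictionary and function names are hypothetical, and a real application would hold the codes in a reference table.

```python
# Codes are designated ahead of time; nothing is invented at assignment time.
AIRPORT_CODES = {
    "ORD": "Chicago O'Hare International Airport",
    "JFK": "John F. Kennedy International Airport",
    "LAX": "Los Angeles International Airport",
}


def substitute_key_for(full_name, codes=AIRPORT_CODES):
    """Return the predetermined code for a name, or fail if none was designated."""
    for code, name in codes.items():
        if name == full_name:
            return code
    raise KeyError(f"no substitute key has been designated for {full_name!r}")
```

A name with no designated code raises an error rather than receiving a value on the fly, which is the behavior that separates a substitute key from a surrogate key.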

An alternative for addressing 
denormalized and all-purpose pri¬ 
mary keys is the controlled intro¬ 
duction of data redundancy. If some 
applications use denormalized or 
all-purpose keys and others do not, 
the original application with pri¬ 
mary key problems can introduce 
redundancy by including the com¬ 
ponent attributes as nonkey data 
elements in the table using the 
group or layered attribute as the 
primary key. Additional indexes to 
support joins and access require¬ 
ments would have to be defined, 
installation limitations permitting. 
This approach would avoid the 
complicated application program¬ 
ming required by using substring¬ 
ing and concatenation. 

Doing nothing is also an op¬ 
tion. To an extent, the program¬ 
ming complexities can be mitigat¬ 
ed through the creation of views 
for joins and selections requiring 
the use of string operators. How¬ 
ever, the performance problems 
and inability to perform operations 
such as joins and attribute concat¬ 
enation will still exist. 

TRICKS OF THE TRADE 

Surrogate keys are not the "silver 
bullet" for primary key problems. 
But under the right set of circum¬ 
stances, surrogate keys can meet 
the requirements of a primary iden¬ 
tifier without creating additional 
problems for users and database 
designers. The real trick is to know 
when they are appropriate and 
when some other alternative would 
be a better fit.

Navigating the DBMS Labyrinth

BY STEVEN CANIANO

The concluding part of our DBMS selection criteria will help you decide
if a product can handle systems, architectural, and operational issues

SYSTEM INTERNALS and architectural
issues cover items especially critical
to those who must design an application
architecture or work with the
DBMS engine's internal workings.
Some of the categories covered in
the checklist over the past two
months have addressed very specific
areas of functionality. However,
this month, I'll address the product's
major design components and
then focus within those components.
Since these areas compose
the heart of a DBMS, the
design decisions a vendor makes
often affect the direction and composition
of a product and product
family for many years to come.

SECURITY

In the broadest sense, you can
think of security as the ability to
control access to a database and its
data. When examined closely, security
can either be present or not
present in many areas, and each
one can take on multitudes of implementations.
Security is of utmost importance
to almost every application.
Therefore, you must
understand the levels of security a
product supports and how they
are supported. To understand this
category, examine these areas:

□ Does the DBMS depend on
operating system security only or
does it use its own security system
on top of the operating system?

□ Does the DBMS support the
concept of DBMS-defined users? If
so, what level of user and user
group administration is necessary?
How are remote users incorporated
into the security system?

□ How is database and table 
ownership determined? 

□ Does the database provide 
the ability to control entrance into 
the database? Can entrance be con¬ 
trolled for a single user, group of 
users, or all users? 

□ Can you control who cre¬ 
ates tables and indexes and alters 
system catalogs? Are grantable per¬ 
missions supported on the data- 
definition language? 

□ What is the create, re¬ 
trieve, update, delete (CRUD) lev¬ 
el security for tables? Are granta¬ 
ble privileges available to create, 
retrieve, update, and delete rec¬ 
ords from tables? Are they sup¬ 
ported at the single user, group, or 
public level? 

□ Can you establish permis¬ 
sions within a table; for example, 
permission on a row, column, field, 
and so on? 

□ Are permissions flexible 
enough to support time of day and 
day of week sensitivity (for ex¬ 
ample, providing a user with ac¬ 
cess to data from 9 a.m. to 5 p.m., 
Monday through Friday only)? 

□ Do permissions extend to 
views? Are they separately definable? 


□ Does security extend to 
stored procedures and rules? 

□ Is the concept of a private 
database, which is accessible only 
by its creator, supported? 

□ What type of security is 
available on database journal and 
log files? 

□ How is transaction data, 
which is sent over a network, 
secured? 

Obviously, different products 
implement many different securi¬ 
ty features. Some of these features 
may be useless to some applica¬ 
tions and indispensable to others. 
When evaluating a DBMS, you must 
determine your security needs be¬ 
fore investigating a product's 
functionality in this area. Failure 
to do so will often result in a key 
requirement being overlooked un¬ 
til it is too late to implement it 
during development. 
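
To make the time-of-day and day-of-week question in the checklist concrete, here is a minimal sketch of such a permission check. The function name and its defaults are hypothetical; a DBMS that supports this feature would enforce it inside the engine, not in application code.

```python
from datetime import datetime, time


def access_allowed(when, start=time(9, 0), end=time(17, 0), weekdays=range(0, 5)):
    """True if `when` falls from 9 a.m. up to 5 p.m., Monday through Friday.

    datetime.weekday() numbers Monday as 0 and Sunday as 6.
    """
    return when.weekday() in weekdays and start <= when.time() < end
```

If a product cannot express a rule like this, the check ends up in every application that touches the data, which is the situation the checklist is meant to expose.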

CONCURRENCY CONTROL 

Concurrency control is the DBMS's 
ability to manage the isolation and 
locking of all or parts of a data¬ 
base. This function occurs in a mul¬ 
tiuser environment when multiple 
users attempt to update a database
at the same time. In these cases, the
DBMS must assure that the database 
remains in a consistent state through¬ 
out all operations. This aspect of a 
DBMS may seem too low level to 
be very important to customers; 
however, a DBMS will often pro¬ 
vide various levels of concurrency 
management that provide custom¬ 
ers with numerous design and de¬ 
velopment options. Some areas to 
focus on include: 

□ How does the DBMS con¬ 
trol database locking? 

□ What types of locks are 
supported (shared, exclusive, in¬ 
tent to update, and/or demand)? 

□ Is lock escalation support¬ 
ed? (For example, if you lock a 
great number of individual data 
pages in a table, it is often more ef¬ 
ficient to lock the entire table in 
lieu of pages.) Is this feature auto¬ 
matic or can it be controlled? 

□ Does the DBMS support 
promotable locks? (For example, if 
you already hold a shared lock on 
a record, can it be promoted to an 
exclusive lock without releasing 
the shared lock?) 

□ Are time-outs for locks 
available? How are deadlocks de¬ 
tected and handled? 

□ What is the default locking? 

□ How does locking work in 
a distributed environment (for ex¬ 
ample, locking among processors)? 

□ What level of locking gran¬ 
ularity (page, row, table, database, 
and so on) is supported? 

□ Can you tune the locking 
system? 

The amount of flexibility in 
the concurrency control system of¬ 
ten relates directly to the level of 
performance that can be attained. 
Therefore, it is an area you should 
examine in some detail, particular¬ 
ly if you are familiar with the na¬ 
ture of an application's transac¬ 
tions and their interaction. 
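
One common textbook approach to the deadlock detection mentioned in the checklist is to look for a cycle in a "waits-for" graph. The sketch below shows that idea only; it is not a description of how any particular product works, and the function name and data layout are assumptions.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    waits_for maps each transaction to the set of transactions whose locks
    it is waiting on.
    """
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {}

    def visit(txn, path):
        state[txn] = ACTIVE
        path.append(txn)
        for other in waits_for.get(txn, ()):
            other_state = state.get(other, UNSEEN)
            if other_state == ACTIVE:
                # We came back to a transaction still on the path: a cycle.
                return path[path.index(other):]
            if other_state == UNSEEN:
                cycle = visit(other, path)
                if cycle:
                    return cycle
        path.pop()
        state[txn] = DONE
        return None

    for txn in list(waits_for):
        if state.get(txn, UNSEEN) == UNSEEN:
            cycle = visit(txn, [])
            if cycle:
                return cycle
    return None
```

Once a cycle is found, the DBMS must still choose a victim to roll back, and lock time-outs are the usual fallback when no detection is performed at all.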

QUERY OPTIMIZER 

A query optimizer is perhaps the 
one aspect of a relational DBMS 
that contributes most to the sys¬ 
tem's performance. You can think 
of the optimizer as built-in intelli¬ 
gence that determines the best ac¬ 
cess path to the requested data. 
This process may sound rather 
simple but, on the contrary, is very 
complex because determining the 
best path to data is not only difficult
but may mean different things
to different people. An intelligent 
query optimizer can often remove 
a burden from a developer's shoul¬ 
ders and compensate for the lack 
of programmer expertise. Never 
ignore the query optimizer in a 
DBMS evaluation. Some specific 
features to analyze include: 

□ How does the optimizer 
determine the best path to the 
data? Does it use statistics the data¬ 
base keeps about the data, deter¬ 
mine the least-cost method, re¬ 
quire the programmer to specify 
the access path by the query's or¬ 
der, or determine an access path 
based on predetermined rules? 

□ Can a DBA specify data stat¬ 
istics or update the system's statis¬ 
tics? Does the optimizer automati¬ 
cally update statistics, or can a user 
access and manually update these 
statistics? Can users request an up¬ 
date, or must they update manually? 

□ Can the optimizer account 
for differences in data distribution 
(skewed data patterns, key uni¬ 
formity, and so on) in a table? 

□ What does the optimizer 
take into account (for example, 
numbers of rows in tables, index¬ 
es, unique key values, estimated 
CPU and I/O, estimated data trans¬ 
fer to a networked environment) to 
determine the access path? 

□ Can the optimizer estimate 
query costs? How does it define 
them? 

□ Is there a query explain fa¬ 
cility that supplies the optimizer's 
access choices prior to the query's 
execution? 

□ Can you override an opti¬ 
mizer's access choice? 

In general, the more intelli¬ 
gent the optimizer, the less knowl¬ 
edgeable the user must be. Of 
course, intelligence comes at a price, 
which, in many cases, is the system 
overhead for determining the access 
path. For simple queries, it may take 
more time to determine an access 
path than to access the data! Of 
course, a good optimizer more 
than makes up for this trade-off in 
the complex query case. Usually, 
an item such as a query explain fa¬ 
cility is an invaluable tool to any 
development shop. 
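
As an illustration of what a query explain facility reports, the sketch below uses SQLite's EXPLAIN QUERY PLAN statement through Python's built-in sqlite3 module. SQLite is only a stand-in chosen because it ships with Python; the table, index, and helper names are hypothetical, and the wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (invoice_no INTEGER PRIMARY KEY, customer_no INTEGER, amount REAL)"
)
conn.execute("CREATE INDEX invoice_customer ON invoice (customer_no)")


def explain(conn, sql, params=()):
    """Return the optimizer's access choices without running the query itself."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]   # the last column holds the plan text


# Equality on an indexed column: the plan should mention the index.
indexed = explain(conn, "SELECT * FROM invoice WHERE customer_no = ?", (42,))
# A predicate on a column with no index: the plan should be a table scan.
unindexed = explain(conn, "SELECT * FROM invoice WHERE amount > ?", (100.0,))
```

Comparing the two plans before execution is the kind of check that lets a developer spot a missing index without timing the query.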

OBJECT MANAGEMENT 

Many think that, after relational, 
object-oriented database represents
the next major database paradigm.
However, it remains to be seen 
whether an entire new breed of 
products will displace today's popu¬ 
lar choices. Nevertheless, to remain 
state-of-the-art, today's market lead¬ 
ers will have to incorporate some 
degree of object-oriented features 
and knowledge management into 
their products. In some cases, a ven¬ 
dor may have already expanded a 
product along these lines; in others, 
plans may be on the drawing board. 
To identify these differences, you 
should examine the following: 

□ Can the concept of inheri¬ 
tance be implemented among ob¬ 
jects? Are object types and subtypes 
allowed? 

□ Does the DBMS support 
user-defined functions? Are they 
enforced at the DBMS level? 

□ Does the DBMS support 
recursive and nested functions? 

□ Can large nontraditional 
objects (image, voice pattern, and 
so on) be stored in a database file? 
If so, how are they manipulated? 

□ Are standard DBMS securi¬ 
ty features in use for these large 
objects? 

□ What storage medium is 
used to house large binary data? 

□ Can the DBMS tool sets 
display such images? 

Although these features are 
somewhat advanced at present, 
they will become even more im¬ 
portant in the future. If you're cur¬ 
rently considering a DBMS prod¬ 
uct, you should understand what 
the vendor's plans are in these 
areas to avoid being behind the 
technological times in the future. 

DISTRIBUTED PROCESSING 

From a database perspective, you
can think of distributed processing
as the ability to request data
held on a processor that is remote
from the requestor. A subset of
distributed processing, distributed
database expands the concept by
permitting data residing on separate
processors (or at least in separate
databases) to be viewed and
accessed as if it were a single database.
These architectural issues are
among the most critical to many 
customers today, and it is there¬ 
fore crucial to understand the im¬ 
plementation of distributed fea¬ 
tures by the major vendors. The 
following questions raise some im¬ 
portant considerations: 

□ Does the product provide 
homogeneous read capability (the 
ability to read remotely from the 
same DBMS product in a multi¬ 
processor, networked environment)? 
Must both DBMS products be at 
the same release level? 

□ What is the connecting soft¬ 
ware and underlying networking? 

□ Does the product support 
downloads or "snapshots" of re¬ 
mote data only or are multisite 
joins supported? 

□ How does the DBMS de¬ 
termine where the data resides? In 
the case of a multisite table join, 
how does it determine the optimal 
location (the most efficient man¬ 
ner in terms of processing and net¬ 
work traffic) to perform the join? 

□ Does the product maintain
a "global data dictionary" or global
catalogs? Is it stored on one or
many sites? (If the dictionary is stored
on one site, it runs the risk of a single
point of failure that can stall all
distributed processing. If it is stored
on multiple sites, the product must
keep all the sites in sync with one
another as the database changes.)

□ How can a distributed data¬ 
base be partitioned? (Some possi¬ 
bilities include by table, by columns 
within tables, by keys within tables 
and so on [for example, horizontal 
versus vertical partitioning].) 

□ Does the product support 
remote access for heterogeneous 
products? If so, which products 
does it interface with, and what are 
the software and networking com¬ 
ponents? What limitations does the 
product impose on access? Can it in¬ 
terface to nonrelational products? 

□ Does the product support 
homogeneous write capability (the 
ability to update a remote database 
in a multiprocessor, networked 
environment)? 

□ Does it permit single-site
updates only or does it support a
two-phase commit protocol that
permits simultaneous multisite
updates?

□ Does the product support
heterogeneous write (remote up¬ 
dates of foreign DBMS products)? 
If so, can it act as the coordinator 
of such a transaction? Will it func¬ 
tion as a participant to another 
coordinator? 

□ How are deadlocks han¬ 
dled in the distributed case (for 
homogeneous and heterogeneous 
products)? 

□ Can the product participate
in two-phase commits driven by a
transaction processing monitor?

□ Which processors, networks, 
gateways, and product versions are 
supported by this product in its 
distributed architecture? 

□ Does the product support 
the ISO RDA standard? 
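
Several of the questions above turn on the two-phase commit protocol, so a bare-bones sketch of the coordinator's logic may help. The class and function names are hypothetical, and the sketch leaves out everything that makes the protocol hard in practice: logging, time-outs, and recovery after a crash.

```python
class Participant:
    """A site taking part in a distributed update (illustrative only)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        """Phase 1: vote yes only if the local update can be made durable."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit at every site or at none of them."""
    votes = [site.prepare() for site in participants]   # phase 1: collect votes
    if all(votes):
        for site in participants:                       # phase 2: commit everywhere
            site.commit()
        return True
    for site in participants:                           # any "no" vote: undo everywhere
        site.rollback()
    return False
```

A product that acts as coordinator plays the role of `two_phase_commit` here; one that functions only as a participant supplies the prepare, commit, and rollback operations to another coordinator.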

The number of questions on 
this list gives you a fair impression 
of the complexities raised by dis¬ 
tributed computing and distribut¬ 
ed database. Before approaching 
vendors on this issue, you must 
understand your business comput¬ 
ing needs. In many cases, a distrib¬ 
uted solution is not the proper 
business solution. Nevertheless, 
with more and more powerful 
desktop resources, distributed pro¬ 
cessing should always be a consid¬ 
eration. Therefore, you must be 
aware of the vast array of imple¬ 
mentations and their implications. 

NETWORK COMPATIBILITY 

Tied very closely to distributed 
processing are underlying net¬ 
work compatibility issues. DBMS 
vendors must typically make many 
choices as to their preferred archi¬ 
tectural offerings in terms of pro¬ 
cessors and operating systems sup¬ 
ported and the various networking 
components that tie the systems 
together. Needless to say, it is 
mandatory that the network archi¬ 
tectures entrenched in a corpora¬ 
tion are among the supported ven¬ 
dor offerings. Some key network 
issues may include: 

□ Can the product run in a 
Datakit environment? STARLAN? 
FDDI? XNA? StarGroup? 

□ Does it provide connectivity
over a TCP/IP gateway? TCP/IP
over X.25? LU6.2?

□ Can it provide connectivity
directly over X.25?

□ Can it run over Ethernet?
OS/2 LAN Manager? LAN Manager/X?
How does it connect (NetBIOS,
Named Pipes, and so on)?

□ What is the vendor's com¬ 
mitment to OSI standards? 

In this area, you should be 
familiar with the key networking 
requirements for your particular 
environment. For example, if you 
are in the market for a LAN Server 
DBMS, you may not care much 
about the interface from UNIX to 
MVS. However, you would need 
to evaluate the implementation of 
the networking from client to serv¬ 
er over the LAN. 

SERVER COMPATIBILITY 

If you are interested in client/ 
server computing and, therefore, 
positioning the database as a serv¬ 
er, it becomes important to under¬ 
stand how well the DBMS can 
function in the server role. This 
role is one in which the database 
resides on one processor and is 
available to serve the data needs of 
many remote clients. To perform 
this role effectively, the vendor 
must address the following: 

□ How effective is its mem¬ 
ory utilization on the client and 
server processors? 

□ What is the interface to the 
network? Does the product attempt 
to minimize traffic? 

□ Does the DBMS support 
stored procedures and triggers? 
(By grouping requests into callable 
procedures and letting them trig¬ 
ger other procedures, the server 
can perform much work indepen¬ 
dent of the client, thus reducing 
network traffic and overhead.) 

□ Can the product interoper¬ 
ate with a large variety of clients? 
(This issue is especially pertinent 
for a LAN server that may need to 
work with third-party, shrink- 
wrapped software.) 

□ How accessible is the serv¬ 
er? Which platforms can easily get 
to the data stored on the server, 
using which mechanisms? 

Although many traditional
DBMS issues are also issues when
considering a DBMS as a server,
the focus may shift to interoperability
and accessibility and how
well the product can function in 
the server role. A product that per¬ 
forms exceptionally well in a host- 
based environment may fail as a 
server if specific "server-type" fea¬ 
tures are not supported. 

SYNC REPLICATION 

Sync replication refers to the de¬ 
gree of fault tolerance in which an 
application can maintain an up-to- 
the-second, stand-by copy of a data¬ 
base. In the case of failure or cor¬ 
ruption of the primary database, 
the stand-by is simply brought in 
and the application continues pro¬ 
cessing as if nothing happened. Of 
course, this functionality is depen¬ 
dent on hardware and DBMS sup¬ 
port. From the DBMS standpoint, 
the issues include: 

□ Does the product support 
database shadowing or real-time 
replication? 

□ Is a dual-copy, mirror im¬ 
age maintained, or does the prod¬ 
uct rely totally on hardware fea¬ 
tures, such as disk mirroring? 

□ Can a dual copy be main¬ 
tained across multiple processors 
and networks or must they reside 
on the same machine? 

□ How is the primary copy 
brought back in sync with the 
stand-by copy after the primary 
has been brought back online? 

□ At what granularity (full 
database, tables, indexes, and so on) 
is sync replication supported? 

Sync replication is a special¬ 
ized architectural need. Many ap¬ 
plications do not need this level of 
resilience; however, for those that 
need it, the fault tolerance capa¬ 
bilities of the product become key. 

DEFERRED COPY 
MANAGEMENT 

Deferred copy management is a 
milder version of sync replication 
in which a stand-by copy of the 
database is required but need not 
be kept in sync up to the second. 
The stand-by copy lags behind the 
primary in terms of the data's cur¬ 
rency, possibly only coming into 
sync when all activity against the 
primary copy is completed. From a 
DBMS perspective, the issues in¬ 
volved in deferred copy manage¬ 
ment include: 

□ Which mechanism is used
to maintain the stand-by copy?

□ How current is the data in 
the stand-by? Is this option tunable? 

□ How is the stand-by brought 
in sync with the primary copy upon 
loss of the primary? Are some trans¬ 
actions inevitably lost if the primary 
system or network crashes? 

□ Can the stand-by reside re¬ 
motely as well as locally? 

□ If the stand-by is con¬ 
structed by downloads of the pri¬ 
mary copy, can you ship incre¬ 
mental updates from the previous 
download or are only full down¬ 
loads supported? 

□ At what granularity (entire 
database, table, and so on) is de¬ 
ferred copy management supported? 

Once again, deferred copy 
management is a specialized archi¬ 
tectural need for some applications. 
The most critical aspects here in¬ 
clude the copy's currency and 
what occurs upon failure of the 
primary. In many cases, the appli¬ 
cation is willing to live with a 
slightly out-of-date version of the 
database. 
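
The difference between shipping incremental updates and shipping full downloads can be sketched with a simple change log. Everything here is an assumption made for illustration: the class names, the numbered log, and the in-memory dictionaries standing in for the two database copies.

```python
class Primary:
    """Primary copy that records every change in a numbered log."""

    def __init__(self):
        self.data = {}
        self.log = []   # entries are (sequence, key, value); sequence starts at 1

    def write(self, key, value):
        self.data[key] = value
        self.log.append((len(self.log) + 1, key, value))

    def changes_since(self, sequence):
        """Entries with a sequence number greater than `sequence`."""
        return self.log[sequence:]


class Standby:
    """Stand-by copy that lags the primary until it is refreshed."""

    def __init__(self):
        self.data = {}
        self.applied = 0   # sequence number of the last change applied

    def refresh(self, primary):
        """Apply only the changes made since the previous refresh."""
        for sequence, key, value in primary.changes_since(self.applied):
            self.data[key] = value
            self.applied = sequence

    def lag(self, primary):
        """How many changes the stand-by is behind the primary."""
        return len(primary.log) - self.applied
```

How often `refresh` runs is the tunable currency the checklist asks about, and any changes still counted by `lag` when the primary fails are the transactions at risk of being lost.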

OPERATIONAL ISSUES 

Operational issues cover items es¬ 
pecially critical to those who sup¬ 
port or administer a DBMS prod¬ 
uct on a daily basis. Some of these 
issues refer to specific tools that 
are either necessary or would be 
helpful in systems administration. 
Others are more global issues that 
refer to the product's resilience 
and reliability and its effect on 
other system components. 

OPERATIONAL INTERFACE 

Critical to a systems administrator 
is the ability to know what's hap¬ 
pening inside the DBMS. Typical¬ 
ly, some sort of operational inter¬ 
face is provided as a "window" into 
the system. If no such tool is pro¬ 
vided, troubleshooting and system 
tuning can become much more 
difficult. Some items to examine 
include: 

□ Is there a tool (preferably a 
graphical interface) to view the 
running system? 

□ Can reports be generated 
that outline the key system param¬ 
eters and possible incorrect values? 

□ Are the operational tools 
aimed (optionally executed) at dif¬ 
ferent experience levels (for ex¬ 
ample, novice versus expert)? 


□ Is the concept of mapping 
functions to roles (for example, 
systems administrator versus DBA) 
in place? 

□ Do tools exist for start-up
and shut-down of a DBMS or
transaction-based system? Is the concept
of a partitioned start-up and
shut-down in place?

□ Which tools are provided 
for life-cycle administration (in¬ 
stallation, reinstallation, upgrades, 
and so on)? Are the operational 
tools included with the base prod¬ 
uct, purchased separately, or ob¬ 
tained through a third party? 

Obviously, a systems admin¬ 
istrator must regularly interact with 
the system's DBMS. This interac¬ 
tion should occur naturally. Does 
the administrator need to become 
an expert on the DBMS? What type 
of learning curve is required for 
new support staff personnel? Can 
administrators for one DBMS easi¬ 
ly port their skills to another 
DBMS? Many of these questions 
can be answered by the functiona¬ 
lity of the operational tools. 

SYNCHRONIZATION 

In many environments, it is desir¬ 
able (if not mandatory) to have 
compatible versions of a DBMS 
product across all operating envi¬ 
ronments. This synchronization is 
necessary in a distributed environ¬ 
ment, in which products must com¬ 
municate with one another. It may 
also be necessary from a support 
standpoint since it is desirable for 
one person to be able to support a 
DBMS product in all environments. 
This concept is referred to as cross¬ 
environment synchronization. Some 
issues to analyze include: 

□ Where do your processors 
and operating systems fall in the 
vendor's porting cycle? (If a large 
gap exists between ports for two of 
your environments, you may find 
yourself with an upgrade in only 
one environment for some length 
of time.) How well does the vendor
administer the product's portability
across platforms?

□ Does the product allow 
phased implementation and cut¬ 
overs (for example, upgrading one 
processor in a distributed environ¬ 
ment and another at a later time, 
rather than all at once)? 

□ Will tools from the prior 
release as well as third-party tools 
work with the DBMS engine's 
new release? (Again, this issue is 
critical in a phased cutover or a 
distributed environment.) 

□ Are multiple versions of 
the DBMS product supported un¬ 
der the same system image? (This 
feature may be useful during up¬ 
grades and beta tests.) 

Ideally, the vendor provides 
a great degree of flexibility in up¬ 
grade procedures. In many cases, a 
flash cutover in a distributed envi¬ 
ronment is simply not feasible. 
The releases should be upwardly 
compatible and complement one 
another. In some cases, however, 
you will find that elaborate data¬ 
base or applications conversions 
are required to move from release 
to release. This fact makes matters 
quite complex, particularly in a 
distributed environment. 

PERFORMANCE MONITOR 

Performance is often one of the
most important aspects when selecting
a DBMS product. Good performance
comprises many
components, ranging from good
planning and design to efficient 
processing. You can think of a per¬ 
formance monitor as the DBMS's 
life support system, which moni¬ 
tors and verifies that the system is 
achieving and maintaining desired 
performance levels. If it is not, the 
performance monitor points out 
problem areas and suggests or im¬ 
plements improvements. Some sub¬ 
jects to address in a performance 
monitor include: 

□ Does the product provide 
performance trending reports (re¬ 
sponse and transaction times, TPS 
rates, and so on)? Are they pre¬ 
sented textually or graphically? 

□ Does the product indicate 
levels of CPU usage, DASD usage 
and accesses, and system idle time? 

□ Can the performance mon¬ 
itor isolate specific users and de¬ 
tect deadlock conditions? Can it 
detect who is waiting for record 
locks and who is holding them?
Can it identify runaway processes
and terminate them?

□ Does a trace facility exist 
that can report the number of 
times a transaction is processed, 
parsed, and executed? 

□ Does it provide heuristic 
guidelines for tuning and debug¬ 
ging aids? 

□ Does the performance mon¬ 
itor provide snapshot information 
and real-time information? 

□ Is it executed online or in 
batch? 

□ Which environment does 
the product run in (for example, 
host-resident versus PC-based)? Is 
it external or integrated into the 
engine? 

□ Is the performance moni¬ 
tor provided with the standard 
DBMS engine or is it sold separately?
Is it, or are alternatives, provided
by a third party?

You will probably find wide 
variations in the performance moni¬ 
tors from different vendors. Some 
will claim that several rudimentary 
commands make up the monitor, 
others will provide elaborate prod¬ 
ucts, and other popular engines will 
offer third-party alternatives. Re¬ 
gardless, it is likely that a perfor¬ 
mance monitor of some kind will 
be a required component of any 
performance-oriented system. 
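
One of the checklist questions above asks whether a monitor can tell who is waiting for record locks and who is holding them. As a rough illustration only (not any vendor's implementation), a deadlock shows up as a cycle in a "wait-for" graph. The sketch below assumes the monitor can export a mapping from each waiting user to the user holding the lock it wants; the names `waits_for` and `find_deadlock` are hypothetical.

```python
def find_deadlock(waits_for):
    """Return a list of users forming a wait cycle, or None.

    waits_for maps a waiting user to the user holding the lock it
    needs (hypothetical monitor output; each user waits on at most one).
    """
    for start in waits_for:
        seen = []
        user = start
        # Follow the chain of "waits for" links from this user.
        while user in waits_for and user not in seen:
            seen.append(user)
            user = waits_for[user]
        if user in seen:
            # We came back to a user already on the chain: a cycle.
            return seen[seen.index(user):]
    return None


# ann waits for bob, bob waits for cy, cy waits for ann: deadlock.
locked = {"ann": "bob", "bob": "cy", "cy": "ann", "dee": "ann"}
# Here bob holds a lock but waits for nobody, so no cycle exists.
healthy = {"ann": "bob", "dee": "ann"}
```

Calling `find_deadlock(locked)` returns the users in the cycle, while `find_deadlock(healthy)` returns `None`. A real monitor would also have to handle users waiting on several locks at once.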

CAPACITY PLANNING TOOL 

Often, it is useful to test system 
workloads and performance char¬ 
acteristics prior to actual develop¬ 
ment. To do so usually requires 
some sort of capacity planning or 
applications-modeling tool. This 
tool will let a designer or adminis¬ 
trator build a conceptual model of 
the applications system and pro¬ 
vide its operational and perfor¬ 
mance characteristics and predic¬ 
tions as outputs. This type of tool 
can greatly reduce applications- 
development and design time. Some 
issues to examine include: 

□ Is such a capacity planning 
and modeling tool available with 
the product or from a third-party 
vendor? 

□ Can it predict the perfor¬ 
mance of utilities (backup, recov¬ 
ery, and so on) as well as transac¬ 
tion performance? 

Unfortunately, this type of 
tool is not usually provided by 
DBMS vendors. Some third-party
vendors specialize in this field;
however, they typically provide 
such tools for only the most popu¬ 
lar DBMS engines. 

RELIABILITY 

This issue attempts to get at the 
product's production strength by 
relying on published statistics that 
the vendor should be able to pro¬ 
vide, as well as specific features re¬ 
garding fault tolerance and product 
reliability. Here, you must examine 
the vendor's ability to react and 
service problems in a crisis situa¬ 
tion. Pertinent issues include: 

□ What is the rate of outages 
this product requires? Is a period 
of downtime required each day or 
can the product support an around- 
the-clock operation? 

□ What are the demonstrated 
mean time to failure and mean 
time to repair statistics from the 
vendor's customer base? 

□ Does the product provide 
any fault-tolerant capabilities? 

□ What is the vendor's com¬ 
mitment (on-site personnel, time 
commitments, and so on) to repair 
the product in a crisis situation? 

In general, you must be con¬ 
fident that you are using a reliable 
product with a competent and com¬ 
mitted vendor behind it. A vendor 
with a good track record has prob¬ 
ably performed in mission-critical 
applications before. A vendor who 
cannot provide a track record may 
be just entering the market for 
such systems. 
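
The mean-time-to-failure (MTTF) and mean-time-to-repair (MTTR) figures mentioned above are commonly combined into a steady-state availability estimate, MTTF / (MTTF + MTTR). The sketch below applies that standard formula; the sample figures are invented for illustration and are not vendor statistics.

```python
def availability(mttf_hours, mttr_hours):
    """Fraction of time the system is expected to be up."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_hours_per_year(mttf_hours, mttr_hours):
    """Expected hours of outage in a 365-day year."""
    hours_per_year = 365 * 24
    return hours_per_year * (1 - availability(mttf_hours, mttr_hours))


# Hypothetical product: fails every 999 hours, takes 1 hour to repair.
uptime = availability(999, 1)
outage = downtime_hours_per_year(999, 1)
```

Comparing `outage` across candidate products gives a quick sense of whether around-the-clock operation is realistic.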

RECOVERY TOOLS 

Database backup and recovery can 
be broadly defined as the ability to 
offload and reload the database to 
and from a reliable medium. This 
activity is an absolutely crucial as¬ 
pect of any system, and provides 
an application with a necessary lev¬ 
el of security against hardware er¬ 
rors and data corruption. You should 
examine backup and recovery from 
the standpoint of functionality, reliability,
and performance. Specific
issues to consider include:

□ Does the product provide 
utilities for backup and recovery 
or does it rely solely on operating 
system facilities? 

□ How and when is a full 
backup normally performed? To 
which type of archive medium can 
it be dumped (tape, disk, combina¬ 
tion, and so on)? 

□ When backing up a data¬ 
base, how do you switch from one 
device to another when necessary? 

□ Are online backups (back¬ 
ups while the transaction system is 
active) supported and, if so, what 
is the implementation and its 
limitations? 

□ Is database mirroring sup¬ 
ported? How does it come into play 
during backup and recovery cycles? 

□ Is there a fast dump utility 
(data import and export) you can 
use in lieu of the normal backup 
and recovery facilities? 

□ Are the backup and recov¬ 
ery tools tailored to address the 
skill levels of different users (nov¬ 
ice, experienced, and so on)? 

□ Can a partial backup be 
performed or is a full backup al¬ 
ways required? 

□ Does the product support 
incremental backups? 

□ Can multiple databases be 
backed up on the same machine in 
parallel? 

□ What performance measure¬ 
ments have been demonstrated by 
the product for backup and recov¬ 
ery (for example, time per MB, cost 
per MB)? 

Backup and recovery are often
overlooked as significant issues
in the selection process, and
are often considered a necessary
evil. The functionality and perfor¬ 
mance of these utilities (if they ex¬ 
ist) can have a great effect on the 
ultimate operations of the applica¬ 
tion and, therefore, should not be 
overlooked. 

JOURNALING 

Journaling is the ability to record 
database updates to a medium so 
they can be used for database re¬ 
covery in case data is lost or cor¬ 
rupted. Journaling is tightly inte¬ 
grated with the backup and 
recovery features of a product, as 
well as resilience to operating sys¬ 
tem and processor failures. In a 
typical scenario, when a large 
amount of data is lost, you may
recover from the last full backup
and then apply the journal entries 
against the backup to bring the 
database up-to-date. When system 
failure occurs, the DBMS must check 
that all transactions in progress at 
the time of the failure have either 
been written to disk (if completed) 
or rolled back. Some key areas to 
explore include: 

□ Is transaction journaling re¬ 
quired for a database or DBMS sys¬ 
tem? Can it be turned on and off? 

□ Can journaling be turned 
on and off at the table level (for 
example, journaling for select ta¬ 
bles only)? 

□ Can tables be recovered 
from journals individually or must 
the entire database be recovered as 
a unit? 

□ How is journal manage¬ 
ment handled? Is a fixed number 
of journal files cycled and offload¬ 
ed as necessary, or is a set of files 
expanded as needed? 

□ Is duplex journaling sup¬ 
ported and, if so, how is it kept in 
sync? What is the journal file me¬ 
dium and what is it offloaded onto? 
(Obviously, the medium should be 
as reliable as possible.) 

□ Upon journal recovery, can 
all journal files be merged, apply¬ 
ing only the latest version of a giv¬ 
en record, rather than all interim 
updates to that record that may be 
in the journals? 

□ Is point-in-time recovery 
supported? Can you recover by user 
(all transactions except those ex¬ 
ecuted by a given user) or by table 
(selective restore)? 

□ What is stored in the jour¬ 
nals (for example, disk images or 
transactions to be reexecuted)? 

□ Are audit procedures in 
place to report on journal entries? 

□ How tunable is the jour¬ 
naling system (such as sizes and 
numbers of files, and so on)? 

□ What is the performance 
overhead associated with journal¬ 
ing? How has it been measured? 

□ Does the product have the 
ability to restore to a different 
physical medium or configuration 
(for example, in the case where a 
physical device must be changed)? 

□ After system failure, how 
is the database made consistent 
with the journals? Does the product
support a warm or cold restart?
Can you roll back the journals to
the point of consistency and then 
backout and reissue the in-flight 
transactions? Can you roll back to 
a point in time? Are these features 
automatic, or can they be specified 
if necessary? 

□ What is the scenario for 
switching to stand-by copies of a 
database (local and remote)? 

□ How secure are the journal 
files and tapes? 
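
The recovery scenario described at the start of this section (restore the last full backup, then apply the journal) can be sketched roughly as follows. This is a simplified illustration, not how any particular DBMS implements it: the journal is assumed to be a list of `(txn_id, action, key, value)` tuples, and only updates from transactions with a `commit` entry are applied, so work in flight at the time of failure is effectively rolled back.

```python
def recover(backup, journal):
    """Rebuild a database from a full backup plus a journal.

    backup  : dict of key -> value as of the last full backup
    journal : list of (txn_id, action, key, value) tuples, oldest
              first; action is "update" or "commit" (assumed format)
    """
    committed = {txn for txn, action, _, _ in journal if action == "commit"}
    database = dict(backup)  # never modify the backup itself
    for txn, action, key, value in journal:
        # Updates from transactions that never committed are skipped.
        if action == "update" and txn in committed:
            database[key] = value
    return database


backup = {"acct1": 100, "acct2": 50}
journal = [
    ("t1", "update", "acct1", 80),
    ("t1", "update", "acct2", 70),
    ("t1", "commit", None, None),
    ("t2", "update", "acct1", 0),  # t2 was in flight at the crash
]
restored = recover(backup, journal)
```

Here `restored` reflects transaction t1 only; t2 never committed, so its update is ignored.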

Many of these issues may
seem too technical to consider from 
an application standpoint; howev¬ 
er, you must understand them to 
design and plan effectively for a 
successful system. Journaling and 
system recovery can be very com¬ 
plex issues but, since they can de¬ 
termine the degree to which data 
is consistent and up-to-date, they 
cannot be ignored. 
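
One checklist item above asks whether journal files can be merged so that only the latest version of each record is applied. A minimal sketch of that idea follows, assuming each journal entry is a hypothetical `(sequence_number, record_key, record_image)` tuple holding a full after-image of the record.

```python
def merge_journals(*journals):
    """Collapse several journal files into one entry per record.

    Each journal is a list of (seq, key, image) tuples (assumed
    format). Only the image with the highest seq survives per key.
    """
    latest = {}
    for journal in journals:
        for seq, key, image in journal:
            if key not in latest or seq > latest[key][0]:
                latest[key] = (seq, image)
    # Return entries in sequence order, ready to apply to a backup.
    return sorted((seq, key, image) for key, (seq, image) in latest.items())


journal_a = [(1, "cust7", "addr=Elm St"), (2, "cust9", "addr=Oak Ave")]
journal_b = [(3, "cust7", "addr=Main St"), (4, "cust7", "addr=Pine Rd")]
merged = merge_journals(journal_a, journal_b)
```

This shortcut only works when the journal stores record images. If the journal stores transactions to be reexecuted, every interim update generally has to be replayed in order.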

MULTITIER RESTORATION 

As distributed processing solutions 
become more and more popular, 
distributed administration takes 
on a high degree of importance. 
Multitier restoration addresses the 
ability for a networked environ¬ 
ment to be treated as a single unit 
administratively. The following is¬ 
sues must be addressed in a dis¬ 
tributed manner: 

□ Does the product permit 
system start-up and shutdown in a 
coordinated fashion among multi¬ 
ple sites? 

□ How is database backup, 
recovery, and system restart han¬ 
dled in a tightly integrated, dis¬ 
tributed architecture? 

Many DBMS vendors will 
leave such complex issues as co¬ 
ordinated administration for dis¬ 
tributed environments to the hard¬ 
ware and network vendors. More 
and more, however, DBMS ven¬ 
dors are providing the networking 
software. The vendor who includes 
the capabilities to coordinate and 
administer such a complex environ¬ 
ment will have a major advantage 
over the competition. 


RELIABLE DATA STORAGE 

Obviously, you must feel comfort¬ 
able that any data an application 
stores in a database is stored in a 
safe and reliable place. More im¬ 
portantly, data that appears to 
have been committed in a transac¬ 
tion must actually be committed to 
the database. This concept is called 
reliable data storage. The impor¬ 
tant questions to ask include: 

□ What medium is used to 
house the database and journal 
files? How reliable and resilient is 
the medium to system crash? 

□ Does the transaction man¬ 
agement architecture guarantee that 
any committed transaction will ulti¬ 
mately be reflected in the database? 
(This issue may be especially impor¬ 
tant for systems that store databases 
using the UNIX file system as op¬ 
posed to raw disk slices. In this case, 
data can be written to system buff¬ 
ers, which may not make it to disk 
in the case of system crash.) 

In general, reliable data stor¬ 
age is simply a requirement for any 
transaction-based system that must 
guarantee accuracy to the last trans¬ 
action executed. If such security is 
not provided, the product is not a 
viable candidate for the system. 
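
The parenthetical above about UNIX system buffers can be made concrete. On POSIX-style systems, a write normally lands in operating system buffers first; a program that wants the data on disk before reporting a commit must ask for it explicitly. The sketch below uses Python's standard `os.fsync` call to illustrate the idea; the journal file name and record format are made up, and a real DBMS does considerably more.

```python
import os
import tempfile


def durable_append(path, record):
    """Append a record and force it to stable storage before returning."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write its buffers to disk


# Hypothetical journal file in a temporary directory.
journal_path = os.path.join(tempfile.mkdtemp(), "journal.log")
durable_append(journal_path, b"COMMIT t1\n")
durable_append(journal_path, b"COMMIT t2\n")
with open(journal_path, "rb") as f:
    contents = f.read()
```

The forced write costs time on every commit, which is one reason the journaling overhead question raised earlier matters.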

A TRICKY BUSINESS 

As you can see from the many is¬ 
sues discussed over the past three 
months, DBMS selection can be a 
tricky business. I hope this series 
has provided you with new per¬ 
spectives on this decision that will 
help you make more knowledge¬ 
able choices in the future. I en¬ 
courage you to embellish and cus¬ 
tomize this list of issues by adding 
any key aspects of database science 
I may have overlooked or crucial 
product or vendor capabilities par¬ 
ticular to your application. As your 
DBMS evolves, remember that you 
must continue to update this list 
over time.

REPOSITORY REPORT


How to keep your repository straight with version control 


You've just installed a brand-new
application that has
consumed your life
for the last year. Users
love it because it satisfies all
their requirements and requires no 
enhancements to meet the demands 
of your users' volatile business en¬ 
vironment. It has no bugs and in¬ 
tegrates effortlessly into your cor¬ 
poration's application portfolio. 
You're ready to face the challenge 
of developing yet another flawless 
application for your adoring busi¬ 
ness users. Ah, life is good. 

Then the phone rings, jolting 
you back from your daydreams of 
happy users and finished systems. 
Your user begs you to include just 
one more little enhancement in the 
release that is currently being test¬ 
ed and wonders if the subsequent 
release could possibly be installed 
a month sooner. Welcome back to 
the wonderful world of applica¬ 
tions maintenance. 

An application seems to take 
on a life of its own when the first 
release is installed into production. 
Users are quite adept at identify¬ 
ing additional data and functions 
that need to be supported. New 
technology and design concepts 
become available so fast that 
they've often become obsolete be¬ 
fore you've even had a chance to 
integrate them into your systems. 

The need to develop evolu¬ 
tionary applications that are archi¬ 
tected for flexibility and change is 
obvious. As Amdahl Corp. states 
when marketing Huron, its
applications-development product: "An
application is always complete, but 
never finished." 

Techniques must be estab¬ 
lished to support the maintenance 
phase of the applications life cycle. 


BY TERRY MORIARTY

Managing Application Evolution

Supporting change and version 
control may not be as fun and ex¬ 
citing as enterprise modeling and 
application design, but they are 
just as essential for successful in¬ 
formation resource management. 

Maintenance is the process of 
implementing changes and en¬ 
hancements throughout the life of 
an application. Version control is a 
set of procedures and techniques 
that let multiple incarnations of 
the same application component ex¬ 
ist simultaneously as the applica¬ 
tion evolves over time. 

Version control has two lev¬ 
els: application component and ap¬ 
plication release. An application 
component is an identifiable constituent
of an application. Examples
include entity, business function,
program, database, data element, and
test case. In addition, an applica¬ 
tion component is documented as 
a repository object. 

An application release, on the 
other hand, is an identifiable pack¬ 
age of related application compo¬ 
nents that has been or is intended 
to be implemented. Multiple re¬ 
leases of the same application may 
be in development and production 
at the same time. 

Application version control 
starts at the application component 
level. When an application compo¬ 
nent is created, its corresponding 


repository object is assigned a ver¬ 
sion or revision number of zero. 
Whenever a change is made to the 
repository object, the version num¬ 
ber increases by one. The unique 
name of the repository object con¬ 
sists of the name assigned by the 
repository user plus a specific ver¬ 
sion number. When the repository 
object is referred to by name only, 
the version with the highest ver¬ 
sion number is automatically ac¬ 
cessed. However, a specific ver¬ 
sion can be identified by simply 
including its version number with 
the repository object name. 
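
The naming rule described above can be sketched as a small lookup structure. This is an illustration only; the class and method names are hypothetical and do not correspond to any repository product's interface.

```python
class Repository:
    """Toy repository that keeps every version of each object."""

    def __init__(self):
        self._versions = {}  # name -> list of definitions, index = version

    def save(self, name, definition):
        """Store a new version; the first save gets version 0."""
        self._versions.setdefault(name, []).append(definition)
        return len(self._versions[name]) - 1

    def get(self, name, version=None):
        """Fetch by name alone (latest) or by name plus version."""
        versions = self._versions[name]
        return versions[-1] if version is None else versions[version]


repo = Repository()
v0 = repo.save("customer", "name, address")
v1 = repo.save("customer", "name, address, phone")
v2 = repo.save("customer", "name, address, phone, credit limit")
```

After these three saves, `repo.get("customer")` returns the newest definition, and `repo.get("customer", 0)` returns the original one.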

Consider the application com¬ 
ponent, customer, which describes 
an entity in the application's infor¬ 
mation model. Upon creation, it is 
assigned a version number of zero. 
As the definition of the applica¬ 
tion component evolves, new ver¬ 
sions are created, as illustrated in 
Figure 1. The name "customer" 
identifies CUSTOMER_REV.2. 
However, the other two versions 
can be accessed by specifically 
identifying CUSTOMER_REV.0 or
CUSTOMER_REV.1.

GUIDE refers to this type of 
version control as "serial." Each 
subsequent revision of a repository 
object creates a new object based
upon the prior version. Each revision
exists independently of the
others, and the repository ensures
that users access the correct version
based on the version number
included with the name.

FIGURE 1. Repository object versions.

FIGURE 2. Incorporating the repository into the release library set.

The second level
of version control is
used to manage ap¬ 
plication releases. Re¬ 
lease version control 
is nothing new to applications de¬ 
velopers. Techniques have been 
implemented in most environments 
allowing new releases to be based 
upon prior releases, but evolve in¬ 
dependently of them. The illustra¬ 
tions of this function in this col¬ 
umn apply to the IBM mainframe 
environment, using Manager Software
Products' DataManager repository.
However, the concepts apply
regardless of the technology used 
to implement the applications- 
development environment (ADE). 

When the development of a 
new release commences, an envi¬ 
ronment is established through 
which the new versions of the ap¬ 
plication components are main¬ 
tained. In the IBM MVS environ¬ 
ment, a set of related libraries is 
created, as shown in Figure 2. Nor¬ 
mally, application components are 
managed through libraries estab¬ 
lished for a specific purpose; that 
is, program source code and proc 
job control language (JCL) are main¬ 
tained in separate libraries. As Fig¬ 
ure 2 also illustrates, when the 
ADE is controlled through the re¬ 
pository, a repository partition 
must be created to house the re¬ 
pository object versions specific to 
the application release. With Data¬ 
Manager, the repository partition 
is implemented as a "status." 

While the application component
resides in the appropriate library,
the corresponding documentation
is maintained in the corresponding
repository partition.
When the repository is used ac¬ 
tively in production, the documen¬ 
tation in the repository is used to 
generate the actual application 
component, as illustrated in Figure 
3. For example, a CASE tool uses 
the data structure Group Y, which 
is documented in the release's re¬ 
pository partition, to generate Y's 
COBOL data definition. The CO¬ 
BOL statements are stored as a 
member in the release's Copylib. 

Likewise, the ADE's code generator
should use the logic documented
in Module X's minispec,
which is maintained in the repository,
to generate X's COBOL source
code. This source code is stored in 
the release's Sourcelib. Additional 
CASE tools should be available to 
generate database data-definition 
statements, proc and JCL, and other
application components. The
libraries in a library set evolve
through the development life cy¬ 
cle as a unit. 

So far, we've only defined 
the library set of one application 
release. Let's expand the concept 
to handle multiple releases. Most 
applications are developed to sup¬ 
port serial releases. The changes of 
the prior releases serve as the basis 
of a given release. Each release is 
intended to supersede the prior 
one. When application releases are 
serial, their library sets can be ar¬ 
ranged into a library stack, as 
shown in Figure 4. The base li¬ 
brary set contains all the applica¬ 
tion components for the entire ap¬ 
plication. Initially, the base is the 
library set for the first release of 
the application, but as the applica¬ 
tion evolves, it corresponds to the 
library set for the application com¬ 
ponents currently installed in 
production. 

The library sets are placed in 
the stack in release order. A re¬ 
lease library contains only those 
application components that have 
changed from the prior version. A 
repository must maintain reposi¬ 
tory object versions across release 
partitions by identifying which 
versions are referenced by a spe¬ 
cific release. For example, when a 
change is made in the second-quarter
release of Module X, a new version
(X_1) is created that is based
on the production version.

FIGURE 3. Interaction between repository and libraries.

FIGURE 4. Application release library sets.

Module X is modified again
in the third-quarter release. Therefore,
version X_2 is based on version
X_1 from the second-quarter
release. The library sets are concatenated,
which allows a release to
"see-thru" the library set stack and 
find the correct version of a re¬ 
pository object and its correspond¬ 
ing application component. For 
example, the third-quarter release 
references X_2 and Y_1, plus Z
and B_1 from the first-quarter re¬ 
lease and the production versions 
of A and T. DataManager's Ad¬ 
vanced Status Facility provides 
support for the concatenation of 
repository partitions. 
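
The "see-thru" lookup can be sketched with Python's standard `collections.ChainMap`, which searches a list of mappings in order and returns the first match. The contents of each library set below are inferred from the example in the text rather than copied from Figure 4, so treat them as assumptions.

```python
from collections import ChainMap

# Base set: everything currently in production (assumed contents).
production = {"A": "A", "T": "T", "B": "B", "X": "X", "Y": "Y"}
# Each release set holds only the components it changed or added.
first_quarter = {"Z": "Z", "B": "B_1"}
second_quarter = {"X": "X_1"}
third_quarter = {"X": "X_2", "Y": "Y_1"}

# Concatenate the sets, newest first, so lookups fall through the stack.
q2_view = ChainMap(second_quarter, first_quarter, production)
q3_view = ChainMap(third_quarter, second_quarter, first_quarter, production)

q3_components = {name: q3_view[name] for name in sorted(q3_view)}
```

The third-quarter view resolves X to X_2 and Y to Y_1, while the second-quarter view still sees X_1 and the production version of Y. Nothing is copied between sets; the ordering of the stack does all the work.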

In an active ADE, application 
components are always generated 
from a repository partition into 
another library in the same release 
library set. Code is never generat¬ 
ed from the third-quarter release 
repository partition into the first- 
quarter release Sourcelib. 

When a release is
installed into production,
its library set is
merged with the current
production libraries.
At this time, the versions of
the application components and 
repository objects maintained in 
the release's library set replace the 
older versions residing in the pro¬ 
duction-equivalent libraries. Like¬ 
wise, new application components 
introduced into the new release 
are added to the base set. For fall¬ 
back purposes, it's prudent to re¬ 
tain a copy of the production- 
equivalent library set as the archive 
version before merging it with the 
current release. 
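
The installation step can be sketched in the same spirit. The function below is hypothetical: it treats a library set as a dictionary of component name to version, archives the current production set for fallback, and then merges the release into it.

```python
def install_release(production, release):
    """Return (new_production, archive) after installing a release.

    Changed components replace the older production versions, new
    components are added, and the old set is kept for fallback.
    """
    archive = dict(production)         # copy retained for fallback
    new_production = dict(production)
    new_production.update(release)     # replace changed, add new
    return new_production, archive


production = {"A": "A", "B": "B", "X": "X"}
first_quarter = {"B": "B_1", "Z": "Z"}
production, archive = install_release(production, first_quarter)
```

If the release has to be backed out, `archive` still holds the production set exactly as it was before the merge.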

Parallel version control allows
several versions of a single application
component to exist simultaneously.
However, these versions
aren't meant to supersede one another.
In the ideal world (in which
the application is developed from 
an enterprise model), this type of 
version control probably isn't nec¬ 
essary. However, few corporations 
maintain this ideal application 
portfolio. Rather, legacy applica¬ 
tions must also be managed and 
quite often manifest the following 
complications: 

□ The same name is used to 
refer to entirely different applica¬ 
tion components; 

□ The same application com¬ 
ponent uses the same name, but 
has been implemented differently 
in separate applications. For ex¬ 
ample, ACCT-STATUS-CD may use 
numeric codes in one application, 
while another uses alphanumeric 
codes; 

□ Applications implement the 
same corporate definition for a re¬ 
pository object, but must stage the 
implementation of enhancements 
to that object independently; 

□ The same application has
been customized to meet the needs
of different customers; 

□ Enhancements are being 
made to the corporate definitions, 
which are not yet ready to be 
implemented. 

These situations require par¬ 
allel version control in which in¬ 
dependent library stacks can be 
maintained, as shown in Figure 5. 
Interrelated applications that inte¬ 
grate their release plans can share 
the same stack. A release can
see-thru its own stack, but can't
see-thru to other applications'
stacks.

Version control is an impor¬ 
tant management tool that provides 
the flexibility necessary to manage 
the application portfolio's evolu¬ 
tion. A repository must provide 
support for versioning at both the 
repository object and release lev¬ 
els. Without version control, a re¬ 
pository's ability to control the 
ADE is severely limited. 

Next month, we'll explore the 
role the repository plays in change 
management.

FOR MANAGERS ONLY


Using different approaches to analyze historical data under hypothetical situations 


One of the more
intriguing aspects of 
data warehouse and 
DSS processing is the 
issue of rewriting his¬ 
tory at the detailed, atomic record 
level once it has been correctly en¬ 
tered into the warehouse. (No, I'm 
not suggesting fraud, and yes, I'm
serious.) Once the data warehouse 
has been built and the detailed, 
atomic data has been carefully 
transformed and loaded from the 
operational environment, DSS an¬ 
alysts—especially from the market¬ 
ing and sales departments—have a 
strange habit of wanting to go back 
in time and change the way things 
actually happened. 

Typically, organizations want¬ 
ing to perform this type of analy¬ 
sis are the marketing and sales de¬ 
partments. DSS analysts in these 
organizations always ask such 
questions as, "If we had aligned 
the sales territories differently last 
year, how would sales have stacked 
up?" Or, "We want to consolidate 
sales territories, but what would 
have been the impact of this con¬ 
solidation in the past?" 

This type of "what if" analy¬ 
sis has two requirements: First, the 
past years' (or more properly, his¬ 
torical data in general) detailed 
sales data must have already been 
captured; and, second, it must be 
possible to use this detailed atomic 
data for determining how the territorial
alignment would have affected
sales. In addition, the detailed
sales data should have one or more 
"discriminators" (such as ZIP 
codes) to be able to identify how 
sales territories would have been 
aligned had past circumstances 
been different. Given the appro¬ 
priate historical and detailed data, 
you can answer these marketing 
and sales what if questions. But 
first, it's necessary to go back in 
time and "rewrite history." 


BY WILLIAM H. INMON

Should We Rewrite History?

The most straightforward ap¬ 
proach to this problem is to decide 
what the territorial changes should 
be and go back and adjust each of 
the detailed sales records with this 
new sales territory. However, while 
this approach is straightforward, it
isn't without its various problems. 
Some of the drawbacks to this ap¬ 
proach include: 

□ Depending on how you 
make the changes to the detailed 
data, some of the data may be de¬ 
stroyed forever. In our example, 
the original sales territories may 
be destroyed. Say the algorithm 
for changing older territories is: 
"If territory=ABC and the corre¬ 
sponding ZIP codes are between 
12345 and 23456, then reset terri¬ 
tory to BCD." Depending on the 
alignment of territories and the 
discriminators used, it may be im¬ 
possible to return the detailed data 
to its original state. 

□ This method is costly in 
resources; not only for the initial 
modification of the territories, but 
for each successive iteration of 
what if analysis as well. If DSS an¬ 
alysts will be changing their minds 
frequently, then the resources in¬ 
herent to this approach can be 
intimidating. 

□ For a variety of reasons 
(including those already men¬ 
tioned), this approach doesn't lend 
itself to iterative processing. 
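
The first drawback is easy to demonstrate. The sketch below applies the rule quoted above to a handful of made-up sales records, overwriting the territory in place. The field names are assumptions. Once the rule has run, records that were originally in ABC and records that were already in BCD look the same, so no inverse rule can restore the original assignment.

```python
def realign_in_place(sales):
    """Apply the rule from the text, overwriting the territory."""
    for sale in sales:
        if sale["territory"] == "ABC" and 12345 <= sale["zip"] <= 23456:
            sale["territory"] = "BCD"


# Made-up detail records (ZIP codes kept as integers for brevity).
sales = [
    {"zip": 12400, "territory": "ABC", "amount": 500},
    {"zip": 12400, "territory": "BCD", "amount": 300},
    {"zip": 30000, "territory": "ABC", "amount": 200},
]
realign_in_place(sales)
territories = [sale["territory"] for sale in sales]
```

The first two records now carry the same ZIP code and the same territory, and nothing in the data says which of them used to belong to ABC.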

Another approach is to identify
the discriminators in the detailed
data warehouse atomic record
that are independent of the
sales territory code. Typically, ZIP 
codes are used for this purpose. 
Then, create an index on the
discriminator(s) in the data warehouse. At
this point, you may be thinking, 
wait a minute, this index is most 
likely going to be huge and will 
therefore require a lot of resources 
to build. This thinking is quite 
correct. 

An alternative strategy to 
simply creating an index on the 
discriminator is to write a program 
that summarizes detailed atomic 
sales records by discriminator and 
some unit of time (such as daily, 
weekly, monthly, and so forth) pri¬ 
or to the creation of the index. 
This process is called "preconden¬ 
sation" or "preconditioning the 
data." For example, you could sum¬ 
marize sales data by ZIP code and 
month. 

This summarization process 
requires more work than simply 
creating an index (unless you hap¬ 
pen to have your sales data already 
sorted by discriminator and date). 
Ultimately, however, the precon¬ 
densed data ends up being more 
compact and much easier to index. 
When you have the sales data sum¬ 
marized by discriminator and unit 
of time, you can index relatively 
efficiently. 
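
A minimal sketch of precondensation, assuming each detailed sales record carries a ZIP code, an ISO-format date string, and an amount (the field names and sample values are illustrative):

```python
from collections import defaultdict


def precondense(sales):
    """Total sales amounts by (ZIP code, year-month)."""
    totals = defaultdict(int)
    for sale in sales:
        month = sale["date"][:7]  # "1993-04-17" -> "1993-04"
        totals[(sale["zip"], month)] += sale["amount"]
    return dict(totals)


detail = [
    {"zip": "80120", "date": "1993-04-02", "amount": 100},
    {"zip": "80120", "date": "1993-04-17", "amount": 250},
    {"zip": "80120", "date": "1993-05-01", "amount": 75},
    {"zip": "94107", "date": "1993-04-09", "amount": 40},
]
summary = precondense(detail)
```

Four detail records collapse into three summary rows keyed by discriminator and month. On real volumes the reduction is usually far larger, which is what makes the later indexing cheaper.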

In any case, whether you in¬ 
dex the sales data at the detailed 
level on randomly organized data, 
or precondense the data and then 
index it, you end up with detailed 
sales data indexed by discriminator. 

Without question, indexing 
and precondensing data requires a 
fair amount of resources. Howev¬ 
er, two points should be made here. 
Rewriting detailed history in any 
form is a resource-intensive pro¬ 
cess. Moreover, whether the index 
is built first, or the data is precondensed
and then the index is built,
the data arrives at a state in which
iterative analytical processing can 
be done as efficiently as possible. 

Another way to precondition 
sales data is to add a field called 
DUMMYORG to each sales record. 
As you create this DUMMYORG 
field, you can load it with the ini¬ 
tial territorial sales organization. 
In later processing, you can load 
the appropriate records with their 
"new" sales territory. Finally, you 
can create a new table called orga¬ 
nization history. The purpose of this 
table is to trace the changes of 
sales territory by time. Given any 
moment in time, the DSS analyst 
can determine what the sales orga¬ 
nization structure looked like by 
examining the contents of the or¬ 
ganization history table. The table 
has three essential elements: the 
organization identifier, from-to 
dates, and the discriminators be¬ 
longing to the sales organization 
during the time span identified by 
the from-to dates. Figure 1 shows 
an example of a sales and organi¬ 
zational history table. 
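
Because the figure is not legible in this copy, the following sketch shows one plausible shape for these structures. All field names other than DUMMYORG, and all sample values, are assumptions. Each sales record keeps its original territory untouched and carries a separate DUMMYORG field for the what-if alignment, while the organization history table records which ZIP codes belonged to which territory over a from-to date range.

```python
from datetime import date

# Organization history: (territory, from_date, to_date, ZIP codes).
org_history = [
    ("ABC", date(1992, 1, 1), date(1992, 12, 31), {"12345", "12400"}),
    ("BCD", date(1992, 1, 1), date(1992, 12, 31), {"30000"}),
    ("ABC", date(1993, 1, 1), date(1993, 12, 31), {"12345"}),
    ("BCD", date(1993, 1, 1), date(1993, 12, 31), {"12400", "30000"}),
]


def territory_as_of(zip_code, when):
    """Look up which territory owned a ZIP code on a given date."""
    for territory, start, end, zips in org_history:
        if start <= when <= end and zip_code in zips:
            return territory
    return None


def total_by(sales, field):
    """Total sales amounts grouped by the named field."""
    totals = {}
    for sale in sales:
        totals[sale[field]] = totals.get(sale[field], 0) + sale["amount"]
    return totals


sales = [
    {"zip": "12345", "date": date(1992, 6, 1), "amount": 500},
    {"zip": "12400", "date": date(1992, 7, 1), "amount": 300},
    {"zip": "30000", "date": date(1992, 8, 1), "amount": 200},
]
for sale in sales:
    # Territory as it really was when the sale happened.
    sale["territory"] = territory_as_of(sale["zip"], sale["date"])
    # What-if: pretend the 1993 alignment had applied in 1992.
    sale["dummyorg"] = territory_as_of(sale["zip"], date(1993, 6, 1))

actual = total_by(sales, "territory")
what_if = total_by(sales, "dummyorg")
```

The `actual` totals show sales as they really were, and the `what_if` totals show the same sales under the later alignment. Because only the `dummyorg` field is reloaded, the analysis can be repeated with a different alignment without touching the original data.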

Upon creating the extra ta¬ 
bles and fields, marketing DSS an¬ 
alysts can have their cake and eat 
it too. The marketing department 
can analyze the sales as they really 
were or rearrange the sales districts 
by altering the contents of DUM¬ 
MYORG to reflect the proposed 
changes. By using the organization 
history table, marketing DSS analysts
can also return to any moment
in time and reconstruct the
marketing territories as of that
moment. Such capabilities are
the ultimate in rewriting history. 

In many ways, the chosen 
discriminator—usually ZIP code— 
serves as a "lowest common de¬ 
nominator." Using the discrimina¬ 
tor and other available data, DSS 
analysts can slice and dice data to 
their hearts' content. 

Of course, the approach sug¬ 
gested here has both pros and cons. 
If the data is summarized by dis¬ 
criminator and a certain date, no 
further analysis below the level of 
summarization can be completed. 
For example, if the data is summa¬ 
rized by month, weekly analysis is 
impossible. Or, if data is summa¬ 
rized by ZIP code, no possibility 
exists to analyze data by street. 

Summing data has its draw¬ 
backs, as is the case with any shift 
in granularity. On the other hand, 
summing sales data above the de¬ 
tailed level has major advantages. 
By summarizing sales data, the 
volume of data is often shrunk sig¬ 
nificantly. Therefore, the abilities 
to index data freely, access and 
analyze the data, and quickly cre¬ 
ate different iterations of analysis 
are enhanced by the reduction in 
data volume. As a result, designers 
must balance the expected use of 
the data with the resources con¬ 
sumed by it. 

Other considerations exist as 
well. For example, adding the
DUMMYORG field may be a waste
of space. If the DSS analyst wants 
to create a new organization only 
once, then the DUMMYORG field 
serves no real purpose. Only when 
the DUMMYORG field is loaded 
once and analyzed many times does 
it make sense to create it.

This column has addressed 
the issue of manipulating histori¬ 
cal data to reflect a different orga¬ 
nization and structure. If you think 
the trade-off is worth the cost of 
resources, then you can create and 
analyze "artificial historical data" 
fairly easily. However, you should 
keep the following rules of the 
road in mind: 

□ The original detailed data 
shouldn't be destroyed. 

□ Add dummy fields to ac¬ 
commodate an artificial structure. 

□ Restructuring the data 
should be considered when the ar¬ 
tificial organization is created once 
and analyzed frequently and itera¬ 
tively thereafter. 

□ Don't disturb basic dis¬ 
criminators once you identify them. 
(This rule is mandatory.) 

□ Summarize data up to the 
lowest common denominator prior 
to indexing in order to save con¬ 
siderable resources. 

□ An index is necessary to 
perform reorganization iteratively. 

□ If only a one-time analysis 
is desired, then a simple join of 
data into a working set may be a 
good answer. 

□ A historical record of orga¬ 
nizational unit by discriminator is 
useful. 
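
Several of these rules can be sketched together in a few lines of Python. In this minimal sketch, `store` stands in for a basic discriminator, `org` for the organization recorded at the time of the sale, and the `artificial_org` mapping for the new structure; all of these names and values are hypothetical. Only the DUMMYORG field name comes from the discussion above.

```python
from collections import defaultdict

# Hypothetical detail rows. "store" plays the role of a basic discriminator
# and "org" is the organization recorded when the sale took place.
detail = [
    {"store": "S1", "org": "WEST", "amount": 100.0},
    {"store": "S2", "org": "WEST", "amount": 50.0},
    {"store": "S3", "org": "EAST", "amount": 75.0},
]

# Hypothetical artificial organization, keyed by the discriminator.
artificial_org = {"S1": "NORTH", "S2": "SOUTH", "S3": "SOUTH"}


def add_dummyorg(rows, mapping):
    """Return copies of the rows with a DUMMYORG field added.

    The original rows, including their recorded "org", are left untouched.
    """
    return [{**row, "DUMMYORG": mapping[row["store"]]} for row in rows]


def total_by(rows, field):
    """Sum amounts grouped by the given field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[field]] += row["amount"]
    return dict(totals)


restructured = add_dummyorg(detail, artificial_org)
print(total_by(restructured, "org"))       # history as it was recorded
print(total_by(restructured, "DUMMYORG"))  # history under the artificial structure
```

Because the dummy field is added to a copy, the original detail and its discriminators survive, and the same restructured set can be analyzed repeatedly.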

A final related question is, 
when should original detailed data 
in the warehouse be altered? For 
example, three years after the fact, 
is it ever appropriate to change the 
value of a sale, item sold, or loca¬ 
tion? As a rule, unless the detailed 
sales data has been incorrectly re¬ 
corded, it doesn't make sense to al¬ 
ter data. However, a few legitimate 
exceptions may exist. Consider 
what happens when the post of¬ 
fice changes existing ZIP codes. 
When this case occurs, it may make 
sense to go back and alter old ZIP 
codes to make them conform to 
newer classifications.

William H. Inmon, a database author and 
lecturer, works at Forest Rim Technology 
in Littleton, Colorado. 
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| item | date 

amount 
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FIGURE 1. Data structures for rewriting history. 
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BIRDS OF A FEATHER 


Organizing a productive users group entails a lot more than just getting together


CASE IS A HANDY
acronym that means 
Computer Assisted 
(or aided) System (or 
Software) Engineer¬ 
ing. However, not only is CASE 
misunderstood by many people, but 
its implementation also presents 
various difficulties. In fact, CASE 
implementation requires surmount¬ 
ing a host of barriers as well as 
overcoming corporate resistance. 
This dilemma is the major reason 
why CASE success stories are so 
few and far between: Just because 
a company manages to implement 
one CASE product successfully, it's 
not assured that the next project 
will be as successful, or even that 
CASE will be used again. Yet, in 
the few instances when CASE 
methods have been used success¬ 
fully and metrics existed to mea¬ 
sure the productivity differences 
in using them, the results have of¬ 
ten been dramatic. Therefore, peo¬ 
ple have come back again and again 
to take a stab at CASE. 

To help those braving the 
twisted path that leads to success¬ 
ful CASE project implementation, 
a group of us formed the San Fran¬ 
cisco Bay Area CASE Users Group 
(SFBACUG) in October 1989 in or¬ 
der to share information and ex¬ 
periences. The first meeting, pub¬ 
licized by the good auspices of 
Arthur Young (presently Ernst and 
Young, one of the biggest boosters 
of CASE users groups nationwide), 
was held at the California State 
Automobile Association building 
in San Francisco. Approximately 
150 people attended, and a core 
group was formed to create a board 
of directors. Probably because of a 
lack of good sense, I allowed myself
to be named president, a post
I've held for two years now. And


BY DAVID PLOTKIN 

A CASE 
for Users 
Groups 

the group has grown and learned 
a lot during these past two
informative years.

A lot of what we've learned 
has to do with running a techni¬ 
cally oriented users group. The 
first members of the board were 
volunteers but weren't quite sure 
what they were volunteering for. 
Since starting a new users group is 
a tremendous amount of work, not 
everyone who volunteered was up 
to the task, and a few changes took 
place in the ranks of the board 
during the first year. Therefore, 
it's important to define each office 
or position's tasks up front. An es¬ 
timate of the time demands of each 
job is also helpful because it makes 
members understand exactly what 
they are committing to before they 
volunteer. 

SFBACUG's board of directors 
consists of a president, vice presi¬ 
dent of membership, secretary, trea¬ 
surer, and vice president of pro¬ 
grams and events. The latter posi¬ 
tion is shared—it turns out that 
arranging six meetings a year takes 
quite a bit of time and effort. We've 
found that the best people for this 
job are consultants because they 
have lots of contacts in the indus¬ 
try, know how to go about bor¬ 
rowing room space, and are gener¬ 
ally well organized. Unfortunately, 
consultants are often extremely 


busy earning a living. Therefore, 
sharing the responsibility seems to 
work pretty well. 

IT'S IMPORTANT NOT
to forget the busy 
work. A tremendous 
number of tasks such 
as folding, stuffing, 
and mailing meeting flyers, dupli¬ 
cating and mailing newsletters 
(don't forget to appoint a newslet¬ 
ter editor), and so forth must be 
completed by someone in order for 
the group to function. It's possible 
to hire a commercial outfit to un¬ 
dertake this work, but we hired a 
responsible high school student 
(the treasurer's son) who was will¬ 
ing to do it. We pay him for his ef¬ 
forts, and we're very glad to have 
his help. 

Of course, the matter of mon¬ 
ey must also be considered. Dues 
to cover expenses and hold mean¬ 
ingful programs and agendas are 
essential. The value offered to 
members should be worth what 
you're charging, and you need to 
come up with a dues structure that 
isn't too onerous. For example, most 
of the members of the SFBACUG 
work in corporations. Therefore, it's 
far easier for them to request a sin¬ 
gle check once a year for member¬ 
ship than to pay on a per-meeting 
basis. CASE users groups in other 
cities structure the collection pro¬ 
cess differently based on the 
group's make-up, but we've found 
that a single yearly payment also 
makes it easier to estimate a bud¬ 
get accurately. 

Other sources of money are 
available, but you should be care¬ 
ful in these instances. For example, 
some of the larger tool vendors 
might be eager to make a presenta¬ 
tion to the group and would pay 



DATABASE PROGRAMMING & DESIGN 

for the privilege. In San Francisco, 
however, we haven't chosen to ap¬ 
proach vendors in this way, even 
though other CASE users groups 
have decided to take advantage of 
the opportunity. 

Most importantly, we had to 
decide what we wanted the SFBACUG
to accomplish. One thing it
isn't is a replacement for tool ven¬ 
dor users groups, which are groups 
of tool users who get together to 
discuss the ins and outs of that 
tool. Therefore, we haven't and 
don't intend to hold sessions on 
the various tricky ways to use enti¬ 
ty-relationship diagrams in IEW or 
IEF. Instead, we've concentrated 
on the general issues of using 
CASE. We've tried to provide pro¬ 
grams that will ease the way for 
our members, letting them know 
what to expect, explaining what to 
do about hurdles, and offering a 
way to learn from each other's
experiences. 

Because many of our mem¬ 
bers are just starting out in CASE, 
some of our meetings have addressed
the "how to's" of actually
using it. We have presented step-by-step
guidelines for projects,
and members who have completed 
successful (or not-so-successful) 
projects have shared their exper¬ 
iences. We've also discussed what 
methodologies are, how they af¬ 
fect an organization, and how to 
formulate one that works. 

Another series of topics has 
centered around how to convince 
an organization that CASE is a 
good investment. CASE is a new 
way of doing things, and many or¬ 
ganizations are resistant to large 
changes. Managing organizational 
change becomes important in han¬ 
dling this resistance. Measuring 
the changes that result in using 


CASE is also important. In some 
organizations, a gut-level feeling 
that CASE won't increase produc¬ 
tivity enough to warrant a large 
investment in equipment and train¬ 
ing pervades. The question then 
arises: How do you measure your 
productivity before and after you 
try CASE? What exactly do you 
measure? 

Quality is another hot issue 
in the group. Most people believe 
that CASE will improve the qual¬ 
ity of the developed system. How¬ 
ever, a lot of factors contribute to 
final quality, not the least of which 
is the rigor with which you follow 
the life cycle to develop a system. 
Quality improvement becomes 
more than just a passing interest to 
those seriously contemplating us¬ 
ing CASE to develop their next 
mission-critical system. 

The group also discusses gen¬ 
eral topics, such as the future of 
CASE (Darlene Brown of the Gart¬ 
ner Group gives a great presenta¬ 
tion on this subject), information 
on international CASE symposiums, 
and many other important issues. 


DB2 Performance Management 
& Capacity Planning 
Workshop with 
Dr. Boris E. Zibitsker 

the first hands-on workshop for MVS professionals, with or without 
prior DB2 experience, who want to effectively manage a DB2 
environment. 

Performance Analysts, Capacity Planners, System Programmers, 
and Technical Managers can take advantage of a powerful 
combination of THEORY and PRACTICE. 

During this workshop participants will: 

* design and create DB2 databases 

* create and run SQL queries 

* monitor performance using DB2 Performance Monitors 

* model DB2 applications’ performance using CRYSTAL,™ 
Performance Evaluator, and BEST/1 - MVS 

* learn how to plan for distributed databases 

This 5-day workshop is held every month at our location in
Deerfield, Illinois. Special arrangements can also be made for
on-site workshops.

Mark your calendar NOW for one of our spring workshops: 

March 2-6, 1992
April 6-10, 1992
May 4-8, 1992


For registration or 
more information call 
BEZ Systems, Inc., at 
(708) 940-1010. 

CRYSTAL, Performance Evaluator and BEST/1 - MVS are trademarks of BEZ Systems, Inc.






Write For Us 

We are looking for experienced professionals to write 
practical, how-to articles for Database Programming 
& Design.

Future issues will include topics such as: 

• Improving relational and nonrelational DBMS performance

• Database design and data modeling

• Using PC DBMSs as front ends

• CASE and life-cycle development aids

• Applications development with productivity tools

• DEC and UNIX databases

• Distributed database, client/server, and cooperative processing architectures

• DB2, SQL/DS, IMS, IDMS, Sybase, Supra, Oracle, and other database products

If you have an idea, a draft, an article, or a request for 
writer’s guidelines, we want to hear from you! 

Theresa Rigney 

Database Programming & Design 
600 Harrison St. 

San Francisco, CA 94107 
(415) 905-2482 
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Because big names in the CASE in¬ 
dustry come to San Francisco fairly 
frequently, we've managed to get 
a few of them to speak on occa¬ 
sion, including Vaughn Merlyn of 
Ernst & Young. 

One thing that has really 
contributed to the success of our 
meetings is holding them in con¬ 
junction with CASE-related events. 
CASE World, DB/Expo, and AppliCASE
are large shows in the San
Francisco Bay Area that draw big 
crowds. The promoters of these 
shows have been agreeable about 
providing meeting rooms, speak¬ 
ers from their roster, and some¬ 
times even meals after our meet¬ 
ings. They've also distributed our 
meeting announcements, which has 
saved us time and money. And, of 
course, our members received a 
discount for the show. In return, 
we've made our mailing list avail¬ 
able to them to help promote the 
conference. 

A network of CASE users 
groups spans the nation and
Canada, with groups located in 
most major cities. The Internation¬ 


al CASE Users Group (ICUG) is an 
umbrella organization that is at¬ 
tempting to create a more cohesive 
atmosphere among the individual 
CASE users groups. This cohesion 
would help make sharing informa¬ 
tion easier. Furthermore, many of 
the newer groups have the same 
questions that older groups have 
already tackled. Groups could also 
share information on speakers and 
handle referrals to local users 
groups. In addition, chapter starter 
kits are now available to new 
chapters. To obtain more informa¬ 
tion on ICUG, you're welcome to 
call (213) 688-5466. 

Unfortunately, ICUG's ambi¬ 
tious plan has been slowed by a 
lack of funds (corporations are 
now being solicited for contribu¬ 
tions) and the fact that many local 
CASE users groups don't see the 
need for an umbrella organization 
and affiliation with ICUG. I sus¬ 
pect, however, that the huge dis¬ 
count offered to members of affili¬ 
ated chapters attending the CASE 
Users Group Conference in Atlan¬ 
ta (October 12 to 14, 1992) will 


convince most chapters that affili¬ 
ation is a good idea. San Francisco 
was one of the first chapters to 
join ICUG, and we've already seen 
a benefit from sharing information 
with the groups in Los Angeles, 
Boston, Seattle, and Charlotte, 
North Carolina. 

Being a member of a CASE 
users group is a rewarding experi¬ 
ence. It helps you learn a lot, avoid 
a host of problems, meet new peo¬ 
ple, and hear about new job op¬ 
portunities. In San Francisco, we 
have almost 100 members, and are 
starting a big membership push to 
encourage the approximately 400 
people on our mailing list to join. 
If you use CASE and live in or 
near a big city, check out your lo¬ 
cal CASE users group. If you live 
near San Francisco, I'll see you at 
our next SFBACUG meeting.

David Plotkin is an information systems 
analyst working in data administration 
and modeling at Chevron Corp. in San 
Ramon, California. He is also the presi¬ 
dent of the San Francisco Bay Area 
CASE Users Group. For more information 
on SFBACUG, please call (510) 901-3528. 


MORGAN KAUFMANN ANNOUNCES


A Guide to Developing Client/Server SQL Applications 

Setrag Khoshafian (Portfolio Technologies, Inc.), Arvola Chan (Versant Object Technology), 
Anna Wong (Borland), and Harry Wong (Nomadic Systems) 

March 1992; ISBN 1-55860-147-3; approx 600 pages; paper; $32.95 


"I highly recommend this book... it gives excellent 
treatment to a broad range of database issues, and its 
coverage of database server products reflects the deep 
knowledge of the authors. The information in this book 
will save professionals many years in research and trial- 
and-error learning." 

Richard Finkelstein, Performance Computing 

"If you are moving to a client/server environment, this 
book is the place to start. This book provides a clear, 


readable description of the major client/servers in a 
single source. It ought to save you several months of 
research time at a minimum." 

Joe Celko, Consultant and Columnist for 
DB Programming and Design 

"This book is required reading for anyone interested in 
database client/server technology." 

Robert Orfali, 

Author of Client/Server Programming with OS/2 


This book is a practical introduction to DB manage¬ 
ment for applications programmers, DBA's and 
workstation users entering the client/server environ¬ 
ment. It includes a tutorial on DB management 
concepts, explained with standard SQL, and offers a 
comprehensive survey of client/server technology with 
case studies comparing the capabilities and interfaces 
of dominant database servers such as EE, RDB, Oracle 
and SQL Server. 


Includes: 

• Features and performance of commercial DB servers 

• Individual SQL dialects compared to the standard 

• Server administration requirements 

• Benchmarking techniques 

• SQL applications in heterogeneous environments 

• Application programming interfaces to the servers 

• Future directions for SQL Client/Server Architectures 





FORTHCOMING! 

A Companion to the 
SQL Standard 

Jim Melton and Alan Simon 

Fall 1992, ISBN 1-55860-245-3; approx 450 pages; paper; $34.95

A thorough guide and tutorial for SQL 2, 
co-authored by Jim Melton, Editor of the 
new emerging ANSI/ISO standard for 1992. 


- TO ORDER IN THE US or Canada, call toll free 1-800-745-7323 -

Or, send check, money order or Amex, VISA or MC authorization (acct. no., name on card and expiration date) to Morgan Kaufmann, 2929 Campus Drive, Suite 260, Dept. CH, San Mateo, CA 94403. 
Include shipping & handling (US surface rate: $3.50 1st volume, $2.50 each additional; Intl surface rate: $6.50 1st volume, $3.50 each addt'l). CA residents add sales tax. Intl customers add 10% to
your order. Fax orders to (415) 578-0672. Phone: 415-578-9911. Email: morgan@unix.sri.com
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Over 100 speakers! 


The Best Database Event in the World Gets Better. 


DATABASE WORLD 
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DATABASE LIBRARY 


A reference to help OS/VS programmers migrate to COBOL II 


TECHNICAL AUTHORS
are possibly the most 
critical book review¬ 
ers. They demand 
that the writing be 
clear, correct, concise, and interest¬ 
ing. Well, I'm a technical author, 
and I've found all of these quali¬ 
ties in MVS COBOL II Power Pro¬ 
grammer's Desk Reference by David 
Shelby Kirk. Notwithstanding the 
silly Game Boy-esque title, this 
book is easily the best COBOL II 
technical book I've ever read. It's 
much more than a desk reference, 
and Kirk explains COBOL II topics 
in an intelligently organized, con¬ 
versational manner. Nearly all of 
COBOL II's critical elements are 
covered, including terms and con¬ 
cepts, benefits, OS/VS features un¬ 
available in COBOL II, modified 
OS/VS COBOL operations, new 
COBOL II options, debugging and 
efficiency considerations, compile, 
job control language (JCL) require¬ 
ments, CICS and DB2 issues, and 
more. 

Kirk has obviously been 
working with COBOL II for quite 
a while and has probably taught 
classes in the subject, so he knows 
how to present concepts and tech¬ 
nical material. He annotates all as¬ 
pects of the language throughout 
the book and liberally doles out 
seasoned, professional opinions 
and practical hands-on advice for 
which you would typically have to 
pay consultants huge amounts of 
money. 

Almost every aspect of CO¬ 
BOL II is covered first technically, 
then subjectively. The subjective 
discussions lean toward a multi¬ 
tude of important topics: practical 
usage, benefits, efficiency, programming
clarity, trade-offs, and
so on. You'll rarely find this level 
of in-depth, qualitative coverage 
in the usual technical data process¬ 
ing books. For example, Kirk sug- 


The Many 
Facets of 
COBOL II 

gests placing all accumulators—or 
switches that affect a particular 
logic path—into a single 01-level 
entry and using the INITIALIZE verb. 
He then states, "I've been criti¬ 
cized by a few technicians for rec¬ 
ommending INITIALIZE since it gener¬ 
ates a few more machine instruc¬ 
tions than the proper number of 
MOVE statements. I mention it here 
because these technicians may also 
work at your shop. If machine cy¬ 
cles become more expensive than 
your time, I'll change my views,
but I think the industry passed 
that mark two decades ago." 

The book is geared toward 
professional OS/VS programmers 
who are migrating their shops to 
COBOL II. The orientation of the 
chapters as well as their content 
make liberal use of comparisons 
between the traditional and mod¬ 
ern approaches to things. This ori¬ 
entation could be considered both 
a strength and a weakness of the 
book, however. If your shop sup¬ 
ports both COBOL and COBOL II 
programs, I can't think of a better 
way to present this information— 
programmers will undoubtedly be 
given OS/VS programs to maintain,
probably within the next
three to five years. On the other 
hand, if your shop has completely 
done away with OS/VS technol¬ 
ogy, this material is superfluous. 

A possible downside to Kirk's 
approach concerns stylistic coding 
bias. Although he is careful to add 
disclaimers on many of the stylis¬ 
tic considerations throughout the 


text, new COBOL II users will pro¬ 
bably be influenced to code "a la 
David Kirk," which could lead to 
problems with COBOL II standards 
in some shops. However, you can't 
have your cake and eat it too. Read¬ 
ers pay authors for facts and opin¬ 
ions. And the opinions are prob¬ 
ably more important than the facts 
in the long run. If you have strong 
opinions that have served you and 
others successfully, you might as 
well "stick to your guns." Kirk 
doesn't ride the political fence in 
this book. Rather, he expresses the 
good, the bad, and the ugly of CO¬ 
BOL II. 

As a summary, Kirk presents 
COBOL II coding guidelines, which 
contain abbreviated versions of 
many of the previously mentioned 
concepts and techniques, as well 
as COBOL programs, related pub¬ 
lications, and references for CO¬ 
BOL II, compile options, and JCL. 

My only complaint with this 
book is its length—it's too short! 
Not counting the appendixes, the 
book is only 225 pages long. I could 
have benefited from more detailed 
examples in the DB2 and CICS sections,
and an all-encompassing sample
program or application would
have tied COBOL II's hundreds of
detailed aspects together. Instead,
Kirk includes short yet clear fig¬ 
ures and examples. Picky, picky, 
picky. 

However, the strengths of this 
reference outweigh any weaknesses. 
If you're looking for a COBOL II 
book that qualitatively describes 
all facets of the language and helps 
you migrate your applications from 
OS/VS technology, then look no 
further. 

MVS COBOL II Power Programmer's
Desk Reference by David
Shelby Kirk. QED Information Sciences
Inc., 1991. ISBN 0-89435-305-5,
310 pages, $34.95.

—Jonathan Sayles 
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New maintenance, editing, and data conversion tools spring into action 


GET THE BUGS OUT 
WITH ANTWORKS 

Miaco Corp. has released AntWorks, 
a small yet powerful editor for 
Oracle databases. The product in¬ 
creases productivity by eliminating 
the need for users to write code 
for many commonly used proce¬ 
dures. According to the company, 
AntWorks is an editor, browse utility,
data transfer facility, table
manager, and an alternative to SQL*Loader
in a single product. In addition,
the product provides an intuitive,
menu-driven interface for
users to accomplish their database 
editing goals. 

The AntWorks editor is portable
and will run on most platforms
on which Oracle resides. Pricing
of AntWorks for a single DOS
user starts at $275, with multiuser 
environments priced by platform 
and number of users. 

Miaco Corp., 6300 S. Syracuse 
Way, Ste. 415, Englewood, CO 80111, 
(303) 741-0381. 
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INTERSULV RELEASES 
DESIGN RECOVERY 

Intersolv has begun shipping version
1.0 of Design Recovery for
DOS, a comprehensive maintenance
and redevelopment solution for PC-based
developers. The product automates
the recapture, reanalysis,
and redesign of existing applica¬ 
tions. With Design Recovery's ac¬ 
cess to information and its open 
desktop development environment, 
developers can analyze and docu¬ 
ment existing systems. 

By automatically scanning
information into a centralized LAN-based
repository, Design Recovery
automates the capture of existing


systems information into a CASE 
environment. From its repository, 
information for use in redevelop¬ 
ment can be accessed directly by 
Design Recovery's own analyzers, 
Intersolv's Excelerator analysis and 
design tool, or Intersolv's APS ap¬ 
plication generator. In addition, the 
product's open architecture pro¬ 
vides for exchange of data with 
other CASE tools via IBM's Exter¬ 
nal Source Format or the Electron¬ 
ic Industries Association-approved 
CASE Data Interchange Format. 

Design Recovery reclaims de¬ 
sign specification from code to 
produce high-level graphical re¬ 
presentations of data, process, and 
user interface information, which 
facilitates further understanding 
of an application. The product is 
able to capture individual programs, 
groups of programs, or entire sys¬ 
tems. Data is represented in data 
model diagrams, and screens and 
programs are represented in an 
"Invocation Diagram" format, pro¬ 
viding detail on program control 
as well as the data flow through¬ 
out a program, even if the pro¬ 
gram is unstructured. 

Design Recovery for DOS is 
priced at $6,500 per user. Minimum 
requirements include an IBM PC 
or compatible running DOS 3.3 or 
higher, 2MB memory, and 5MB free 
disk space for swapping. 

Intersolv, 3200 Tower Oaks Blvd., 
Rockville, MD 20852, (301) 230-3200. 
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THREE NEW RELEASES 
FROM BMC 

BMC Software Inc. has announced 
the general availability of three 
new products: Trimar DEDB Sec¬ 
ondary Index Facility, Opertune 
version 1.1 for DB2, and Copy Plus 


for DB2 version 3, release 1. 

Trimar DEDB Secondary Index
Facility is BMC's latest addition
to its Trimar Fast Path Series
and provides secondary indexes for 
data entry databases (DEDBs), giv¬ 
ing application programs an easy 
method for retrieving data from a 
DEDB in an order other than primary
key sequence. Users no
longer need to spend valuable time developing
their own programs for implementing
DEDB secondary indexes.
The product saves time and
money, is easier to maintain, offers 
greater data integrity, and lets us¬ 
ers' applications be up and run¬ 
ning sooner than user-developed 
programs. 

Opertune version 1.1 for DB2 
allows for the dynamic modifica¬ 
tion of DB2 subsystems without 
interruption to users. These changes 
are categorized into parameter ele¬ 
ments or operational assist func¬ 
tions. The product has the ability 
to terminate runaway threads with¬ 
out cancelling a user's address 
space, which means that the user's 
IMS and CICS regions, as well as 
TSO user IDs, remain intact. Other 
operational assist functions include 
active log manipulation and the 
freeing of table spaces for utility 
maintenance. 

Opertune eliminates tuning 
restrictions through a technique 
that causes DB2 itself to make many 
of the requested changes upon de¬ 
mand. Significantly, these changes 
are accomplished without USERMODs
or dynamic hooks to DB2.

Copy Plus for DB2 version 3, 
release 1 is a high-speed copy util¬ 
ity for DB2. The new version of¬ 
fers increased performance and 
new functionality, including the 
ability to make up to four concur¬ 
rent image copies. The product is 
designed and intended to supplant 
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IBM's DSN1COPY utility. 

Additional enhancements to 
the product include support for 
32,000 pages, multiple copy statements,
and multidata set nonpartitioned
table spaces. BMC has also
reduced virtual storage require¬ 
ments below the 16MB line. In ad¬ 
dition, the product includes table 
space validity checking, which en¬ 
sures that the copied pages have 
the correct internal formats. 

Trimar DEDB Secondary In¬ 
dex Facility is priced by CPU level 
and begins at $18,000 for a perpet¬ 
ual license. Prices for Opertune 
version 1.1 for DB2 start at $14,500 
for a perpetual license. Copy Plus 
is priced by CPU level and begins 
at $12,000. 

BMC Software Inc., P.O. Box 
2002, Sugar Land, TX 77487, (800) 
841-2031. 
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ENHANCED RDBMS/ 
4GL CAPABILITIES 

CompuServe Data Technologies has 
announced the immediate availabil¬ 
ity of Version 9 of System 1032, 
the company's RDBMS/4GL system 
for Digital Equipment Corp.'s VAX/ 
VMS family of computers. The Ver¬ 
sion 9 release expands System 
1032's data handling, user interface, 
and VMS integration capabilities. 
The COLLECT JOIN function enhances 
System 1032's ability to manipu¬ 
late massive databases by extend¬ 
ing its capabilities of joining mul¬ 
tiple related datasets (tables). The 
command window provides a new 
window-oriented user interface to 
access data more easily and flexi¬ 
bly. Additionally, Version 9 builds 
on System 1032's VMS integration 
with enhanced VMS services, mem¬ 
ory, and disk usage tools. 

Expanding on existing data¬ 
base join facilities and the union 
collections already in System 1032, 
the new join collections provide 
additional options for combining 
datasets into single logical records. 
Users can collect datasets that 
weren't originally designed to be 
part of a database view without 
performance implications. Once 
they have defined the collection's 
structure, users can pull in data 
from any dataset, previously relat¬ 
ed or not. Then, if they want to 


change the underlying datasets, us¬ 
ers declare the new collection cri¬ 
teria. Redefining the collection 
structure or recompiling applica¬ 
tions that access the collection is 
no longer necessary. 

Prices for the System 1032 
Version 9 product range from $6,000 
to $240,000, depending on CPU size 
across the entire VAX family. 

CompuServe Data Technologies, 
1000 Massachusetts Ave., Cambridge, 
MA 02138, (617) 661-9440. 
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XDB LINKS 
AND SERVES 

XDB Systems Inc. has released ver¬ 
sion 2.41 of XDB-Server, their SQL 
engine for OS/2, and XDB-Link, a 
new software package that lets PC 
applications access mainframe DB2 
data transparently. 

The XDB-Server is a high- 
performance, multiuser database 
system that brings the power and 
integrity of mainframe database 
systems to LANs. The core of the 
company's product line is the DB2-compatible
engine. With the XDB-Server
for OS/2, DB2 applications
become portable to client/server 
platforms. Since the XDB-Server is 
network independent, DB2 com¬ 
patibility can exist on multiple 
platforms and operating systems. 

XDB-Link helps users devel¬ 
op cooperative processing applica¬ 
tions to use MIS resources more ef¬ 
ficiently. For example, applications 
and data that reside in distributed 
areas can be viewed as a single 
data center. 

The XDB-Link product con¬ 
sists of two components: XDB-Link 
Gateway, which resides on the PC, 
and the XDB-Link Host, which re¬ 
sides on the mainframe. These 
components communicate via LU6.2 
protocols to satisfy DB2 SQL re¬ 
quests originated from the PC. 

XDB-Link is based on a 
client/server architecture and sup¬ 
ports DOS, OS/2, and Windows cli¬ 
ents. In a development environ¬ 
ment, the product extends systems- 
level testing to the PC environment 
by providing access to mainframe 
production data. In a production 
environment, XDB's application 
tools and LAN applications can 
now access data residing in main- 
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frame DB2 as a database server. 

XDB-Server supports the NetBIOS
and Named Pipes protocols
and is priced at $2,495. XDB-Link 
is priced from $21,000 to $36,000, 
depending on CPU size. 

XDB Systems Inc., 14700 
Sweitzer Ln., Laurel, MD 20707, (301) 
317-6800. 
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MEET AT THE 
JUNCTION 

Tools & Techniques Inc. has re¬ 
leased Data Junction version 4.1, 
the company's data conversion tool. 
The product continues to exploit 
its "Hub & Spoke" architecture, 
letting users import and export 
their data files to and from virtual¬ 
ly any file format. The product also 
extends the ability to sort, extract, 
rearrange, and edit records, fields, 
and bytes into the exact output 
format required. Parameters gov¬ 
erning a Data Junction conversion 
can be selected interactively or set 
for automatic batch mode operation. 

The product now supports 
memo field formats from popular 
databases. Other enhancements in¬ 
clude improved support for var¬ 
ious DATE formats; the introduc¬ 
tion of "regular expressions" to 
perform Boolean extractions of rec¬ 
ords and search/replace of charac¬ 
ter strings; the ability to auto-align 
source and target fields by name; 
direct update of .dbf file structures 
at the record level; and scaling of 
numeric fields to assign decimal 
position. 
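
For readers unfamiliar with regular expressions, the following generic Python sketch shows the two uses mentioned above: extracting records that match a Boolean condition and performing search/replace on character strings. It uses Python's standard `re` module, not Data Junction's own syntax, and the record layout and values are hypothetical.

```python
import re

# Hypothetical pipe-delimited records; not taken from any real data file.
records = [
    "1001|Austin|TX|(555) 010-0001",
    "1002|Laurel|MD|(555) 010-0002",
    "1003|Naples|FL|(555) 010-0003",
]

# Boolean extraction: keep records whose third field is TX or MD.
state_filter = re.compile(r"^\d+\|[^|]*\|(TX|MD)\|")
selected = [rec for rec in records if state_filter.search(rec)]

# Search/replace: rewrite "(ddd) ddd-dddd" phone numbers as "ddd-ddd-dddd".
phone = re.compile(r"\((\d{3})\) (\d{3}-\d{4})")
cleaned = [phone.sub(r"\1-\2", rec) for rec in selected]

print(cleaned)
```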

DOS licenses for Data Junc¬ 
tion 4.1 are priced at $99, $199, 
and $299 for standard, profession¬ 
al, and advanced levels respective¬ 
ly. UNIX and LAN licenses start at 
$400 for a four-user license. 

Tools & Techniques Inc., 1620 
W. 12th St., Austin, TX 78703, (512) 
482-0824. 
PLAY IT AGAIN,
DB2SAM 

Allen Systems Group Inc. has an¬ 
nounced the release of DB2 Securi¬ 
ty Access Manager (DB2SAM), a 
software package that enhances 
DB2's security facilities and lets
sites organize and easily manage
the DB2 security environment. This 
product also gives data processing 
organizations more control over 
DB2 security while delegating the 
tasks of individual "granting" and 
"revoking." 

Through its ability to main¬ 
tain and provide a comprehensive 
picture of all authorities and privi¬ 
leges, DB2SAM lets sites quickly 
locate and change any specific au¬ 
thorization entry. DB2SAM also 
helps sites keep control of security 
and prevent such situations as cas¬ 
cading DB2 authority. Other key 
features of the product include 
productivity in administering se¬ 
curity, audit trails and reporting, 
and ease of security administration. 
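
One reading of "cascading DB2 authority" is the cascading effect of a revoke: when a privilege is taken away from a user, privileges that user passed on to others can be removed as well. The Python sketch below models that effect for a single privilege. It is a simplified illustration, not DB2SAM's method or DB2's actual algorithm (it ignores grant timestamps, for example), and the user names and the `cascade_revoke` helper are hypothetical.

```python
def cascade_revoke(grants, grantor, grantee):
    """Return the grants that remain after revoking grantor -> grantee.

    `grants` is a set of (grantor, grantee) pairs for one privilege.
    A user who no longer receives the privilege from anyone loses every
    grant he or she made, and the effect repeats down the chain.
    """
    remaining = set(grants)
    remaining.discard((grantor, grantee))
    pending = [grantee]
    while pending:
        user = pending.pop()
        # The user keeps the privilege if some other grant still supplies it.
        if any(g[1] == user for g in remaining):
            continue
        # Otherwise every grant made by this user is revoked in turn.
        for g in [g for g in remaining if g[0] == user]:
            remaining.discard(g)
            pending.append(g[1])
    return remaining

# Hypothetical authorization entries for one privilege.
grants = {
    ("SYSADM", "ANN"), ("ANN", "BOB"), ("BOB", "CAROL"),
    ("SYSADM", "DAVE"), ("DAVE", "CAROL"),
}
after = cascade_revoke(grants, "SYSADM", "ANN")
print(sorted(after))
```

In this example BOB loses the privilege as a side effect of revoking ANN, while CAROL keeps it through DAVE; seeing all entries together is what makes such side effects predictable.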

Prices for DB2SAM range 
from $21,600 to $31,700, depend¬ 
ing on CPU configuration. 

Allen Systems Group Inc., 750 
11th St. S., Naples, FL 33940, (813) 
263-6700. 
STROBE LIGHTS
THE WAY 

Programart Corp. has announced 
the availability of Strobe Release 
8.5, an application performance 
measurement software product for 
the IBM mainframe environment. 
Release 8.5, which incorporates 
new CICS, IDMS, COBOL, and 
PL/1 features, can help developers 
and database professionals improve 
application performance by rapid¬ 
ly pinpointing where and how ap¬ 
plications spend their time. 

According to the company, 
the new release significantly ex¬ 
pands the depth of Strobe's ability 
to attribute resource consumption 
in language library and system ser¬ 
vice routines to specific parts of 
the application. Without this capa¬ 
bility, programmers and technicians 
often waste time and resources try¬ 
ing to determine how application 
performance can be improved. 

Strobe also measures the per¬ 
formance of online and batch pro¬ 
cessing applications. It takes "snap¬ 
shots" of resource utilization while 
applications are running and pro¬ 
duces performance profile reports, 
which detail specific areas of con¬ 
centrated resource consumption. 
Used in each phase of the application
life cycle, Strobe can cut the
cost of running applications, rap¬ 
idly diagnose causes of performance 
crises, and maximize productivity. 

The new Strobe IDMS feature 
includes three new reports that 
provide measurement and evalua¬ 
tion data to help application pro¬ 
grammers and technicians identify 
inefficient or unnecessary use of 
system resources. Its new COBOL 
and PL/1 features include library 
attribution, making Strobe the first 
product capable of identifying the 
source program statements that 
implicitly call COBOL and PL/1 
compiler library subroutines. 

Prices range from $18,000 to 
$80,000, depending on CPU size. 

Programart Corp., University 
Place, 124 Mt. Auburn St., Cambridge, 
MA 02138, (617) 661-3020. 
EVERYONE'S TALKING
NATURAL LANGUAGE 

Natural Language Inc. has intro¬ 
duced Natural Language release 5.0, 
an English-based querying tool for 
relational databases that features 
an embedded, intuitive graphical 
user interface and an enhanced ap- 
plications-development environ¬ 
ment that reduces the deployment 
cycle. This release targets the desk¬ 
top market by supporting most 
desktop UNIX machines. Accord¬ 
ing to the company, the product 
lets users gain direct access to 
database information by asking for 
it in conversational English. 

Release 5.0 features a menu- 
driven interface that complements 
Natural Language's ad hoc English 
access capabilities. Graphics, table 
manipulation, data analysis, and 
reports are now accessible through 
pull-down menus. A browsing ca¬ 
pability offers a list of suggested 
questions that can be searched for 
by topic. Users can focus on the 
data they seek by browsing through 
the list of sample questions. 

Other features of Natural 
Language 5.0 include online help 
that informs users about the prod¬ 
uct's capabilities at the time they 
need them, and a more sophisticat¬ 
ed report generator that lets users 
manipulate tables and generate and 
save reports efficiently. The product
also includes a file manager that lets
users integrate information with
their most familiar tools,
including Lotus's 1-2-3, Microsoft's 
Excel, and SAS. 

Natural Language Inc. and 
Systems & Computer Technology 
Corp. (SCT) have also announced 
the release of IntelliQuest, SCT's 
English access querying tool based 
on the Natural Language product. 
IntelliQuest provides users with 
direct access to data in SCT's Ban¬ 
ner Series of local government and 
higher education software products. 

Pricing for Natural Language 
release 5.0 starts at $4,500 for the 
lower-end desktop platforms. In¬ 
telliQuest is available from SCT, 
and pricing depends on the size of 
the institution or jurisdiction. 

Natural Language Inc., 2910 
Seventh St., Berkeley, CA 94710, (510) 
841-3500. 

Systems & Computer Technol¬ 
ogy Corp., 4 Country View Rd., Mal¬ 
vern, PA 19355, (215) 640-5137. 
FORM A STRATEGIC
INFOALLIANCE

Software Publishing Corp. has re¬ 
leased InfoAlliance 1.10 for OS/2, 
a client/server software product 
designed to let PC users on a net¬ 
work transparently access and ma¬ 
nipulate data located in disparate 
databases throughout the organi¬ 
zation. The product contains all of 
the easy-to-use forms design, re¬ 
port writing, information access, 
and integration features of the ori¬ 
ginally introduced InfoAlliance for 
OS/2 Presentation Manager prod¬ 
uct. Additional capabilities in this 
OS/2 version include access to Mi¬ 
crosoft SQL Server data, the ability 
to manipulate scanned images, sup¬ 
port for OS/2 Dynamic Data Ex¬ 
change, and 3GL triggers for pro¬ 
grammers who wish to add custom 
features with C libraries. 

A single InfoAlliance 1.10 
standalone workstation product 
for either OS/2 or Windows cli¬ 
ents is $495. A workgroup installa¬ 
tion with eight OS/2 or Windows 
clients and a single OS/2-based 
server is priced at $6,455. 

Software Publishing Corp., 3165 
Kifer Rd., Santa Clara, CA 95051, (408) 
450-7314. 


MARCH 1992 


All CASE vendors offer product. Only one offers the BACHMAN vision.

BACHMAN delivers proven software
productivity through Model Driven
Development™ now.

When you examine most CASE 
vendors, they end up looking very 
much alike. 

All except for BACHMAN. We offer 
you a unique vision of how your 
organization can move into the next 
century of applications and systems 
development with products and 
services available now. 

BACHMAN provides an architecture 
to dramatically increase productivity, 
supported by the best consulting 
and training services available. 

Discover BACHMAN Model Driven 
Development.™ 

Start sharing in the vision today. 


1-800-BACHMAN. 


BACHMAN 
CA-IDMS® VS DB2

                                                                      CA-IDMS   DB2

Industry Standards:
  Ansi SQL Support                                                       ✓       ✓
  FIPS                                                                   ✓       ✓
  SAA                                                                    ✓       ✓
  Ansi Standard Domain Integrity Check Constraint                        ✓

Portability: Platforms Supported...
  MVS: MVS/XA                                                            ✓       ✓
  MVS/ESA Dataspace                                                      ✓       ✓
  VSE                                                                    ✓
  VSE/ESA                                                                ✓
  VM                                                                     ✓
  PC-DOS                                                                 ✓
  PC LAN                                                                 ✓
  Fujitsu                                                                ✓

Investment Protection: Runs Applications Without Change From...
  DL/I                                                                   ✓
  IMS                                                                    ✓
  VSAM                                                                   ✓       ✓
  TOTAL                                                                  ✓

Thousands Of Mission-Critical Production Applications                    ✓
Continuous Processing Through Dynamic Configuration Support              ✓
Industrial-Strength Application Development                              ✓
SQL Access To Existing DB’s                                              ✓
Integrated Relational SQL And Navigational Support In A Single DBMS      ✓
Client/Server Architecture                                               ✓       ✓
Century Dates                                                            ✓       ✓
Integrated Dictionary                                                    ✓
Transient Read                                                           ✓
Referential Integrity                                                    ✓       ✓
Row Level Locking                                                        ✓
Roll-Back For Complete Backward Recovery                                 ✓
Case Tool Support                                                        ✓
Bulk Access                                                              ✓
Commit Continue                                                          ✓
Pseudo-Conversational SQL                                                ✓
SQL Timing Options: System-Maintained Clustering                         ✓
Variable Page Sizes                                                      ✓



© Computer Associates International, Inc., 711 Stewart Avenue, Garden City, NY 11530-4787. 1-800-645-3003.

All trade names referenced are trademarks or registered trademarks of their respective companies. 
When you line up all
the facts, it’s easy to 
see why CA-IDMS® is 
the leading, industri¬ 
al-strength DBMS 
thousands and thou¬ 
sands of clients rely 
on. 


CA90s 


Unlike DB2, 
CA-IDMS can handle 
all of your mission- 
critical applications. 

It’s supported by the 
most comprehensive 
architecture ever 
developed, CA90s: 
Computing 
Architecture 
for the 90s. 

And it has all of the 
advanced features 
and functionality you 
need today.

So call 1-800-645- 
3003 to arrange for 
an on-site briefing or 
for more information. 

Call now and get the 
facts. 

As you can see, they 
speak for themselves. 


Computer Associates

Software superior by design.


Information Handling Services 



puts the information you need... into a system
you can trust.

VSMF from Information Handling Services


An Company 


Excellence is our Standard


IHS stands alone in time and cost control 
of technical information searches. 


25 years of 
information 
management 


Information Handling Services was founded in 
1959 by an engineer who was all too familiar 
with the problems of finding technical information 
and finding it fast. Since then, techniques devel¬ 
oped by IHS have revolutionized the process 
of information handling. Engineering, purchasing 
and procurement professionals have come to trust 
the IHS name. 


Putting the customer 
first has kept 
us in the lead 


As the recognized leader in information handling, 
IHS serves more than 6,000 customers in more 
than 65 countries around the world. It is 
important to note that more than 90% of these 
customers renew our services annually. Simply 
put, we have been time tested by the toughest 
critics... our users. 


Unequaled products and 
services at competitive 
prices bring you 
the greatest value 


The world’s largest source of vendor catalogs, 
industry standards, military and federal specifi¬ 
cations, federal government and regulatory data 
is backed by the largest and most personalized 
service organization available. Add our component 
pricing structure and you’ll realize a value that 
no other company can match. 


VSMF: the most 
comprehensive technical 
information system 


A value analysis of VSMF demonstrates unparal¬ 
leled performance in acquiring, organizing and 
disseminating current information. For example, 
it includes more than 66,000 catalogs representing 
more than 26,000 carefully selected manufacturers. 


Nothing can compare to VSMF. 
IHS is your direct link to original
data from major industry,
government, and technical
society sources.

Thousands of phone
calls and letters each
month ensure updating
of documents.

A total of over 10 million feet of microfiche and
microfilm is processed every day via high-speed
micrographics equipment.

Each day, more than 11,000
microfilm cartridges and 13,000
microfiche are shipped to VSMF
users worldwide.

New cartridges or
fiche are received
and immediately put
to use by
VSMF customers.


For procurement specialists, it provides easy 
access to alternate sources and speeds the way 
to finding the best possible product at the best 
possible price. For engineers, it dramatically 
reduces search time resulting in increased 
productivity. The illustration on the preceding 
page will help explain the VSMF system. 


VSMF is kept current 
with constant, 
regular updates 


We know that your crucial procurement and 
design decisions depend on current information.
So our staff of engineers and para-technicians 
work from nearly 200 CRT terminals linked to a 
high-performance data management system to 
track 9 million pages of current information. 


VSMF is easy to use 
because of unique 
cross-referencing 
capabilities 


The VSMF system allows you to quickly begin a 
search by product or subject description, vendor 
name, mil spec number, Qualified Products List 
(QPL), or other parameters. Through the use of 
either the printed index or TECH DATA™ online 
indexing, you can easily identify the appropriate 
cross-referencing code. This code will lead you 
immediately to related supplier catalogs, industry 
standards or government specifications you 
need to reference. 


Exclusive cross-reference locator code 
makes the VSMF system easy to use. 



Vendor catalog pages, military 
documentation, and industry 
standards are presented in 
complete, unedited format. 

Industry standards are indexed
by society, document number, 
and subject. 


Supplier name clearly marked on 
each catalog page. 


Photographs and drawings 
accompany data. 

Hard copies from reader/ 
printer available instantly 
on full-sized sheets.


TECH DATA means 
less time searching 
for information, 
more time using it 



Unlike other online search programs which 
require a computer-trained information specialist, 
TECH DATA allows professionals to “key” in 
requests for technical data themselves. The user 
can then go directly to the correct film location 
in the VSMF service—or even order hard 
copies online. 

TECH DATA offers immediate access to 15 major 
technical and engineering databases. Industrial 
catalogs from more than 26,000 vendors. Over 
90% of the most commonly referenced domestic 
and international industry codes and standards. 
The world’s most comprehensive collection of 
unclassified U. S. military and federal documents. 
Bibliographic files such as NTIS, Inspec and 
Cincinnati Milacron’s Robotics. Full-text data¬ 
bases such as the Kirk-Othmer Encyclopedia of 
Chemical Technology. And a great deal more. 


VSMF makes more than 
9 million pages of vital 
information manageable 


The VSMF data base system is divided into five 
parts so there’s no wasted time spent going 
through irrelevant information. 


Part One is Vendor Catalogs 

with over 26,000 manufacturers 
represented in more than 66,000 
catalogs. 10,000,000 products and 
components are included. The 
information is presented in two 
formats: full catalog (cover-to-cover), 
or side-by-side presentation of like 
products to greatly simplify second 
sourcing and product comparison. The 
VSMF customer can choose either or 
both formats. Further, the customer 
can customize his VSMF system to 
include only those information 
segments that are required. 


Part Two is Industry 
Standards 

which includes more than 90% of the 
world’s most commonly used 
domestic, international and foreign 
national standards. 

AASHTO      AREA      CAA       ISO
AATCC       ARI       CCITT     JIS
ACI         ARINC     CECC      MPTH
AECMA       ASA       CGA       MSS
AFBMA       ASAE      CSA       NACE
A.G.A.      ASCE      DIN       NEMA
AGMA        ASHRAE    ECMA      NFPA
AI          ASME      EIA       NFP(A)
AIA         ASNT      EPRI      RTCA
AIA (NAS)   ASQC      EURODOC   SAE
AISC        ASSE      GPA       SASO
AIChE       ASTM      IEC       SBAC
ANS         AWS       IEEE      SEMI
ANSI        AWWA      IES       SMACNA
API         BSI       IPC       TAPPI
                      ISA       UL

And many more 


Part Three is Department of 
Defense Products 

in what is acknowledged 
industry-wide as the world’s largest 
collection of unclassified military 
specifications and standards. This 
information reflects more than 18 
years of collecting important military 
documents. 

Military Specifications 
and Standards 

Military Handbooks 
MS Drawings 
NASA Standards 

Naval Instructions 
and Directives 

Qualified Products Lists 
And much more... 


Part Four is 
Regulatory Services 

which has been developed to 
maintain awareness of, and help 
ensure compliance with constantly 
changing federal regulations. 
Information included in this database 
includes: 

Material Safety Data Sheets 
Final Safety Analysis Reports 
EPA Reports 
Railway R-l Reports 
Bioenvironmental Data 

Part Five is 

Government Documents 

which provides crucial information 
relating to federal agencies, 
government personnel requirements 
and government procurement 
schedules and regulations. 

FAA 

Federal Construction 
Regulations 

Federal Specifications 

VA 

NRC 

FPM 

Contractor Catalogs and 
Price Lists 

FSS Schedules 

Government Acquisition 
Regulations


VSMF can increase engineers’
productivity 20% 


Microfilm proven more 
efficient than 
hard-copy files 

Traditional Search Time    VSMF Search Time
49.5 minutes               6 minutes


A recent survey* showed engineers spend an 
average of 49.5 minutes on each technical 
data search, for a weekly average of more than 
7.4 hours. VSMF cuts average search time to 
6 minutes per search or .9 hours each week. In the 
long run, VSMF can save as much as $11,000 per 
year per engineer and boost productivity as 
much as 20%. 

* Survey of 1,326 engineers. Search time based on an average of nine technical data searches each 
week. Cost savings based on average engineering hourly wage (with benefits and overhead) of 
$34.50 and 50-week year. Hourly wage source: “Professional Income of Engineers”—American
Association of Engineering Societies, 1983; average annual salary of $34,500 for electronic engineers 
ten years out of college. 
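
The figures quoted above can be checked with a few lines of arithmetic. The Python sketch below simply replays the survey's stated inputs (nine searches a week, 49.5 versus 6 minutes per search, $34.50 per hour, a 50-week year); the variable names are illustrative, and the result is a back-of-the-envelope estimate rather than a guarantee for any particular site.

```python
# Inputs taken from the survey footnote.
SEARCHES_PER_WEEK = 9
TRADITIONAL_MINUTES = 49.5   # average minutes per search, manual files
VSMF_MINUTES = 6             # average minutes per search with VSMF
HOURLY_COST = 34.50          # wage with benefits and overhead, in dollars
WEEKS_PER_YEAR = 50

traditional_hours = SEARCHES_PER_WEEK * TRADITIONAL_MINUTES / 60
vsmf_hours = SEARCHES_PER_WEEK * VSMF_MINUTES / 60
saved_hours = traditional_hours - vsmf_hours

# Yearly saving per engineer: hours saved each week, priced and annualized.
annual_savings = saved_hours * HOURLY_COST * WEEKS_PER_YEAR

print(f"Traditional: {traditional_hours:.2f} hours/week")
print(f"VSMF:        {vsmf_hours:.2f} hours/week")
print(f"Saved:       {saved_hours:.2f} hours/week")
print(f"Savings:     ${annual_savings:,.0f} per engineer per year")
```

With these inputs the weekly saving is roughly six and a half hours per engineer, which is where the figure of about $11,000 a year comes from.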


VSMF boosts efficiency 
organization wide, too 


R&D Drafting 



Engineering Purchasing 

VSMF saves time and money 
throughout your organization. 


VSMF was designed by an engineer who wanted 
more time to be an engineer; greater flexibility in 
locating current data; and readable hard copies 
of drawings and specs from which to work. But 
the resulting system brings the same benefits to 
other professionals in the organization. 

Anyone needing to procure, research, develop, 
design, manufacture or test products and 
systems according to the highest professional 
standards needs VSMF. 

Inter-departmental use of VSMF assures stan¬ 
dardized selection of parts and components and 
simplifies sourcing. So every department saves 
time and money and minimizes its error potential. 


VSMF reduces 
leadtime problems 


There’s no need for purchasing and procurement 
managers or engineers to reinvent the wheel. 
Current information on alternate sources can 
affect important decisions on make vs. buy or 
determining acceptable levels of quality improve¬ 
ment. VSMF helps in finding the best possible 
product at the best price and delivery. 

In fact, investing in VSMF can pay off other ways, 
too. By easing bottlenecks, speeding project 
completion, and cutting back on personnel 
requirements needed to maintain conventional 
datafiles. 


You don’t have to 
buy more than you 
need, either 


Not every organization needs 9 million pages of 
technical data. The VSMF collection is divided 
into smaller specialized data segments to provide 
a custom-tailored VSMF program at an afford¬ 
able price. Of course, your system can easily be 
expanded to meet future growth requirements.


Rely on the industry’s 
most comprehensive 
Customer Service 
Network 



As a VSMF user, you’ll find out what hundreds of 
thousands of professionals already know: There 
is a Customer Service Network that you can call 
on at any time. Our Systems Service Representa¬ 
tives will install your VSMF data files, ensure the 
system is working properly and thoroughly train 
your staff and co-workers on an ongoing basis. 

Your local System Service Representative (SSR) 
will regularly review your information needs to 
make sure your system is serving you most 
effectively. And, of course, your SSR will continue 
to train current and new personnel as needed. 


Free instruction and 
support services via 
Extension 99 



We’re as close as your phone. Although your 
staff will receive complete hands-on instruction 
for the best use of the VSMF system, difficult 
searches may require immediate assistance. 

Whenever a technical search requires extra 
attention, a quick toll-free call to Extension 99 
will put the caller in touch with our Customer 
Service Network and personal help. 

Extension 99 offers direct contact with a technical 
information specialist who can guide the caller 
through a search step-by-step. Most problems are 
resolved immediately. The more difficult tech¬ 
nical searches are solved by Extension 99 
experts in no more than 24 hours. 


Person-to-Person 
Customer Service 



In addition to Extension 99, we maintain a 
specially trained Customer Service staff to handle 
any questions or complaints any time you call. 
They will verify a shipping date, expedite replace¬ 
ment of broken or missing cartridges, lost or 
misplaced indexes. And, they’ll help handle any 
repair problems with equipment leased from IHS. 

IHS is committed to the customers that depend 
on our services. So we invest millions of dollars 
each year in personnel, training, service calls 
and systems to keep the IHS Customer Service 
Network the most reliable in the industry. No one 
else comes close. 

When we say “Excellence Is Our Standard,” you 
can believe us. We know it’s your standard, too.


An IHS exclusive... On-going 
Technical Information Evaluation 


A value-added feature 
you can’t get 
anywhere else, 
and it’s free 


VSMF users can request a detailed, personal¬ 
ized analysis of their information requirements 
at no charge. The Technical Information 
Evaluation—valued at $500—is a personally 
prepared confidential document for use solely by 
the customer and his Account Representative. 

Computer generated, the Technical Information 
Evaluation will clearly and concisely identify 
problems and growth areas that need information 
search assistance. It will also record the time 
and money-saving benefits of VSMF and recom¬ 
mend a future action plan. 



VSMF is the system of the future. 


Your business isn’t 
standing still, and 
neither is IHS 


In the past year, we have informed our customers 
of over 30 important new refinements to the 
VSMF system. We added indexing features and 
sources, and refined presentation formats to 
provide necessary information more efficiently. 
New products were introduced to make sure 
current information needs were met. This 
constant refinement reflects the continuing 
commitment we have to the clientele which has 
made us the leader in the technical information 
retrieval industry. That’s the only way to do 
business when excellence is your standard.


VSMF saves time, saves money. 
Take their word for it 


Exxon Co. 

General Motors Corp. 

Mobil Corp. 

International Business 
Machines Corp. 

General Electric Co. 

Atlantic Richfield Co. 

ITT Corp. 

R. J. Reynolds 
Industries, Inc. 

Eastman Kodak Co. 

The Dow Chemical Co. 

Westinghouse Electric Corp. 

The Boeing Co. 

The Goodyear Tire & 
Rubber Co. 

Xerox Corp. 

PepsiCo, Inc. 

Rockwell International Corp. 

McDonnell Douglas Corp. 

3M 

The Coca-Cola Co. 
W. R. Grace & Co. 


Information Handling Services 

15 Inverness Way East 
Englewood, Colorado 80150 
(800)525-7052 
(In Colorado, call 1-790-0600) 

Excellence Is Our Standard. 


“When our engineers and technicians were 
conducting searches manually, we estimated 
it took 40 hours per week and $210,900 per year 
to find the data we needed. With VSMF it took 
20 hours per week and only $53,900. That’s a 
savings of $157,000 a year.” 

Hughes Aircraft Company 
Radar Systems Group 

“The supplier that supplied us a product last 
week had a unit price of $2,200. We found another 
supplier in the IHS catalog that had the same 
product for exactly half the same price. By the 
time we had purchased all we needed, it saved 
us $27,500.” 

Fort Gordon 

“IHS’ high standard of excellence is evidenced 
not only in the thorough refinement of new infor¬ 
mation products but in the professional demeanor 
and helpfulness of its representatives be it in 
person or by telephone through “Extension 99.” 

The VSMF system has clearly saved our Company 
time, space, and money in the research as well 
as the procurement functions by allowing our 
technically-oriented staff to view current industry- 
related standards as well as side-by-side 
comparisons of manufacturers’ catalogs/products 
in an up-to-date and efficiently manageable 
format.” 

Florida Power & Light Company